mirror of
https://github.com/Relintai/pandemonium_engine.git
synced 2024-12-22 11:56:49 +01:00
125 lines
7.8 KiB
XML
125 lines
7.8 KiB
XML
<?xml version="1.0" encoding="UTF-8" ?>
|
|
<class name="RegEx" inherits="Reference" version="4.1">
|
|
<brief_description>
|
|
Class for searching text for patterns using regular expressions.
|
|
</brief_description>
|
|
<description>
|
|
A regular expression (or regex) is a compact language that can be used to recognise strings that follow a specific pattern, such as URLs, email addresses, complete sentences, etc. For instance, a regex of [code]ab[0-9][/code] would find any string that is [code]ab[/code] followed by any number from [code]0[/code] to [code]9[/code]. For a more in-depth look, you can easily find various tutorials and detailed explanations on the Internet.
|
|
To begin, the RegEx object needs to be compiled with the search pattern using [method compile] before it can be used.
|
|
[codeblock]
|
|
var regex = RegEx.new()
|
|
regex.compile("\\w-(\\d+)")
|
|
[/codeblock]
|
|
The search pattern must be escaped first for GDScript before it is escaped for the expression. For example, [code]compile("\\d+")[/code] would be read by RegEx as [code]\d+[/code]. Similarly, [code]compile("\"(?:\\\\.|[^\"])*\"")[/code] would be read as [code]"(?:\\.|[^"])*"[/code].
|
|
Using [method search], you can find the pattern within the given text. If a pattern is found, [RegExMatch] is returned and you can retrieve details of the results using methods such as [method RegExMatch.get_string] and [method RegExMatch.get_start].
|
|
[codeblock]
|
|
var regex = RegEx.new()
|
|
regex.compile("\\w-(\\d+)")
|
|
var result = regex.search("abc n-0123")
|
|
if result:
|
|
print(result.get_string()) # Would print n-0123
|
|
[/codeblock]
|
|
The results of capturing groups [code]()[/code] can be retrieved by passing the group number to the various methods in [RegExMatch]. Group 0 is the default and will always refer to the entire pattern. In the above example, calling [code]result.get_string(1)[/code] would give you [code]0123[/code].
|
|
This version of RegEx also supports named capturing groups, and the names can be used to retrieve the results. If two or more groups have the same name, the name would only refer to the first one with a match.
|
|
[codeblock]
|
|
var regex = RegEx.new()
|
|
regex.compile("d(?<digit>[0-9]+)|x(?<digit>[0-9a-f]+)")
|
|
var result = regex.search("the number is x2f")
|
|
if result:
|
|
print(result.get_string("digit")) # Would print 2f
|
|
[/codeblock]
|
|
If you need to process multiple results, [method search_all] generates a list of all non-overlapping results. This can be combined with a [code]for[/code] loop for convenience.
|
|
[codeblock]
|
|
for result in regex.search_all("d01, d03, d0c, x3f and x42"):
|
|
print(result.get_string("digit"))
|
|
# Would print 01 03 0 3f 42
|
|
[/codeblock]
|
|
[b]Example of splitting a string using a RegEx:[/b]
|
|
[codeblock]
|
|
var regex = RegEx.new()
|
|
regex.compile("\\S+") # Negated whitespace character class.
|
|
var results = []
|
|
for result in regex.search_all("One Two \n\tThree"):
|
|
results.push_back(result.get_string())
|
|
# The `results` array now contains "One", "Two", "Three".
|
|
[/codeblock]
|
|
[b]Note:[/b] Pandemonium's regex implementation is based on the [url=https://www.pcre.org/]PCRE2[/url] library. You can view the full pattern reference [url=https://www.pcre.org/current/doc/html/pcre2pattern.html]here[/url].
|
|
[b]Tip:[/b] You can use [url=https://regexr.com/]Regexr[/url] to test regular expressions online.
|
|
</description>
|
|
<tutorials>
|
|
</tutorials>
|
|
<methods>
|
|
<method name="clear">
|
|
<return type="void" />
|
|
<description>
|
|
This method resets the state of the object, as if it was freshly created. Namely, it unassigns the regular expression of this object.
|
|
</description>
|
|
</method>
|
|
<method name="compile">
|
|
<return type="int" enum="Error" />
|
|
<argument index="0" name="pattern" type="String" />
|
|
<description>
|
|
Compiles and assign the search pattern to use. Returns [constant OK] if the compilation is successful. If an error is encountered, details are printed to standard output and an error is returned.
|
|
</description>
|
|
</method>
|
|
<method name="get_group_count" qualifiers="const">
|
|
<return type="int" />
|
|
<description>
|
|
Returns the number of capturing groups in compiled pattern.
|
|
</description>
|
|
</method>
|
|
<method name="get_names" qualifiers="const">
|
|
<return type="Array" />
|
|
<description>
|
|
Returns an array of names of named capturing groups in the compiled pattern. They are ordered by appearance.
|
|
</description>
|
|
</method>
|
|
<method name="get_pattern" qualifiers="const">
|
|
<return type="String" />
|
|
<description>
|
|
Returns the original search pattern that was compiled.
|
|
</description>
|
|
</method>
|
|
<method name="is_valid" qualifiers="const">
|
|
<return type="bool" />
|
|
<description>
|
|
Returns whether this object has a valid search pattern assigned.
|
|
</description>
|
|
</method>
|
|
<method name="search" qualifiers="const">
|
|
<return type="RegExMatch" />
|
|
<argument index="0" name="subject" type="String" />
|
|
<argument index="1" name="offset" type="int" default="0" />
|
|
<argument index="2" name="end" type="int" default="-1" />
|
|
<description>
|
|
Searches the text for the compiled pattern. Returns a [RegExMatch] container of the first matching result if found, otherwise [code]null[/code].
|
|
The region to search within can be specified with [code]offset[/code] and [code]end[/code]. This is useful when searching for another match in the same [code]subject[/code] by calling this method again after a previous success. Setting these parameters differs from passing over a shortened string. For example, the start anchor [code]^[/code] is not affected by [code]offset[/code], and the character before [code]offset[/code] will be checked for the word boundary [code]\b[/code].
|
|
</description>
|
|
</method>
|
|
<method name="search_all" qualifiers="const">
|
|
<return type="Array" />
|
|
<argument index="0" name="subject" type="String" />
|
|
<argument index="1" name="offset" type="int" default="0" />
|
|
<argument index="2" name="end" type="int" default="-1" />
|
|
<description>
|
|
Searches the text for the compiled pattern. Returns an array of [RegExMatch] containers for each non-overlapping result. If no results were found, an empty array is returned instead.
|
|
The region to search within can be specified with [code]offset[/code] and [code]end[/code]. This is useful when searching for another match in the same [code]subject[/code] by calling this method again after a previous success. Setting these parameters differs from passing over a shortened string. For example, the start anchor [code]^[/code] is not affected by [code]offset[/code], and the character before [code]offset[/code] will be checked for the word boundary [code]\b[/code].
|
|
</description>
|
|
</method>
|
|
<method name="sub" qualifiers="const">
|
|
<return type="String" />
|
|
<argument index="0" name="subject" type="String" />
|
|
<argument index="1" name="replacement" type="String" />
|
|
<argument index="2" name="all" type="bool" default="false" />
|
|
<argument index="3" name="offset" type="int" default="0" />
|
|
<argument index="4" name="end" type="int" default="-1" />
|
|
<description>
|
|
Searches the text for the compiled pattern and replaces it with the specified string. Escapes and backreferences such as [code]$1[/code] and [code]$name[/code] are expanded and resolved. By default, only the first instance is replaced, but it can be changed for all instances (global replacement).
|
|
The region to search within can be specified with [code]offset[/code] and [code]end[/code]. This is useful when searching for another match in the same [code]subject[/code] by calling this method again after a previous success. Setting these parameters differs from passing over a shortened string. For example, the start anchor [code]^[/code] is not affected by [code]offset[/code], and the character before [code]offset[/code] will be checked for the word boundary [code]\b[/code].
|
|
</description>
|
|
</method>
|
|
</methods>
|
|
<constants>
|
|
</constants>
|
|
</class>
|