
str_match_regex
Returns a list of strings that contain the given substring (case sensitive), allowing for regular expressions.
Available in version 6.3.0 and later.
Prototype
function str_match_regex (
    string_array [*] : string,
    expression   [1] : string
)
return_val [*] : string
Arguments
string_array
A string array of any dimensionality.
expression
The string expression to be matched, with possible regex ("regular expressions") syntax included.
Description
This function returns an array containing every string in string_array in which expression is matched. Unlike str_match, regular expressions are allowed.
If expression is not matched in any element of string_array, the default string missing value ("missing") will be returned.
Note that str_match_regex is case SENSITIVE. Use str_match_ic_regex if you need case insensitivity.
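The no-match case and the case sensitivity can be checked together. The following is a minimal sketch, not part of the original page: the three-element "strings" array is made up for illustration, and the test with ismissing assumes that the returned missing value carries a _FillValue attribute, as the description above implies.

```ncl
; Illustrative array: "Line" appears only with a capital "L".
strings = (/"cnLineColor","mpFillColor","txString"/)

; Case-sensitive search for lowercase "line": no element should match.
matched = str_match_regex(strings,"line")

; Assumption: the no-match result is a string missing value with
; _FillValue set, so ismissing can detect it.
if (all(ismissing(matched))) then
  print("no strings matched")
else
  print(matched)
end if
```

Replacing str_match_regex with str_match_ic_regex in this sketch would be expected to match "cnLineColor" instead.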
A full description of the syntax and capabilities of regular expressions is beyond the scope of this document. See the Unix/POSIX man page for REGEX (7) or similar documentation for a complete explanation, noting that NCL's implementation uses the "modern" form of regular expressions. In reality only a very small subset of the full functionality will be needed for the purposes of this function.
For those not familiar with the topic, one basic point: in a directory listing the asterisk ('*') is a wildcard standing for any number of arbitrary characters, but the equivalent operator in a regular expression is the two-character sequence '.*'.
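As a small illustration of the '.*' point, the sketch below selects strings the way a directory-listing pattern such as temp*nc would. The file names are hypothetical and exist only for this example.

```ncl
; Hypothetical file names, used only for illustration.
files = (/"temp_2001.nc","temp_2002.nc","precip_2001.nc","temp_notes.txt"/)

; ".*" stands for any number of arbitrary characters.
; "^" anchors the match to the start of the string, "$" to the end.
temp_nc = str_match_regex(files,"^temp.*nc$")
print(temp_nc)
```

Only "temp_2001.nc" and "temp_2002.nc" both start with "temp" and end with "nc", so those are the two strings this should print.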
See Also
str_match_ic_regex, str_match_ind_regex, str_match_ind_ic_regex, str_index_of_substr, str_sub_str, str_match_ic, str_match_ind, str_match_ind_ic
Examples
Since the "regex" based string matching functions are similar in how they work, the examples below use a mix of the various regex functions to show how they all work.
Example 1
The following set of examples uses the same "strings" array to do different kinds of searches for strings containing the string "line" and/or "Line".
Example 1a
Get the strings, and the indexes of the strings, that contain "line". The match is case sensitive, so strings that contain only "Line" are not returned:
  strings = (/"cnLineColor","mpFillColor","xyLineThicknessF","txString",\
              "polyline","polyline_ndc","123_line","line_123"/)
  strs_matched     = str_match_regex(strings,"line")
  strs_matched_ind = str_match_ind_regex(strings,"line")
  print(strings(strs_matched_ind))
  print(strs_matched)
Both print statements should output the same set of strings:
  (0) polyline
  (1) polyline_ndc
  (2) 123_line
  (3) line_123
Example 1b
Using the same "strings" array, find all strings that contain either "Line" or "line":
  strings = (/"cnLineColor","mpFillColor","xyLineThicknessF","txString",\
              "polyline","polyline_ndc","123_line","line_123"/)
  strs_matched_ind_ic = str_match_ind_ic_regex(strings,"line")
  strs_matched_ic     = str_match_ic_regex(strings,"line")
  print(strings(strs_matched_ind_ic))
  print(strs_matched_ic)
Both print statements should output the same set of strings:
  (0) cnLineColor
  (1) xyLineThicknessF
  (2) polyline
  (3) polyline_ndc
  (4) 123_line
  (5) line_123
Example 2
Using the same strings array as the previous example, use special regular expression syntax to further narrow the search for particular strings.
Example 2a
Here we use the "^" character with "[a-z][a-z]" to match only the strings that start with exactly two letters followed by the word "line" or "Line":
  strings = (/"cnLineColor","mpFillColor","xyLineThicknessF","txString",\
              "polyline","polyline_ndc","123_line","line_123"/)
  strs_matched_ind_ic = str_match_ind_ic_regex(strings,"^[a-z][a-z]line")
  strs_matched_ic     = str_match_ic_regex(strings,"^[a-z][a-z]line")
  print(strings(strs_matched_ind_ic))
  print(strs_matched_ic)
Both print statements should output the same set of strings:
  (0) cnLineColor
  (1) xyLineThicknessF
Example 2b
This is similar to the previous example, but now match strings that end in "line" using the "$" character:
  strings = (/"cnLineColor","mpFillColor","xyLineThicknessF","txString",\
              "polyline","polyline_ndc","123_line","line_123"/)
  strs_matched_ind = str_match_ind_regex(strings,"line$")
  strs_matched     = str_match_regex(strings,"line$")
  print(strings(strs_matched_ind))
  print(strs_matched)
Both print statements should output the same set of strings:
  (0) polyline
  (1) 123_line
Example 3
Assume you have a mix of date strings with the format "YYYY-MM" or "YYYY-MM-DD" and you want to return only the ones with "YYYY-MM-DD".
  dates = (/"1965-01","1965-01-15","1966-01","1966-01-15","1967-01","1967-01-15"/)
  yyyymm     = str_match_regex(dates,"[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]")
  yyyymm_ind = str_match_ind_regex(dates,"[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]")
  print(yyyymm)
  print(dates(yyyymm_ind))
The output from both print statements should be the same:
  (0) 1965-01-15
  (1) 1966-01-15
  (2) 1967-01-15
Example 4
Get all contour resources ending with "Font" from the list of contour resources:
  class_out = NhlGetClassResources("contourPlotClass", "")
  cn_strs = str_match_regex(class_out, "Font$")
  print(cn_strs)
Output:
  (0) cnConstFLabelFont
  (1) cnHighLabelFont
  (2) cnInfoLabelFont
  (3) cnLineLabelFont
  (4) cnLowLabelFont
  (5) lbLabelFont
  (6) lbTitleFont
  (7) lgLabelFont
  (8) lgLineLabelFont
  (9) lgTitleFont
  (10) tiMainFont
  (11) tiXAxisFont
  (12) tiYAxisFont
  (13) tmXBLabelFont
  (14) tmXTLabelFont
  (15) tmYLLabelFont
  (16) tmYRLabelFont
Example 5
To further narrow the list of resources, get all contour resources starting with "cn" and ending with "Font":
  class_out = NhlGetClassResources("contourPlotClass", "")
  cn_strs = str_match_regex(class_out, "^cn.*Font$")
  print(cn_strs)
Output:
  (0) cnConstFLabelFont
  (1) cnHighLabelFont
  (2) cnInfoLabelFont
  (3) cnLineLabelFont
  (4) cnLowLabelFont
Note: the NhlGetClassResources function actually allows for regular expressions too, so you really don't need to use str_match_regex:
cn_strs = NhlGetClassResources("contourPlotClass", "^cn.*Font$")