
str_match_regex

Returns a list of strings that contain the given substring (case sensitive), allowing for regular expressions.

Available in version 6.3.0 and later.

Prototype

	function str_match_regex (
		string_array [*] : string,  
		expression   [1] : string   
	)

	return_val [*] :  string

Arguments

string_array

A string array of any dimensionality.

expression

The string expression to be matched, with possible regex ("regular expressions") syntax included.

Description

This function returns an array containing every element of string_array in which expression is matched. Unlike str_match, regular expressions are allowed.

If expression is not matched anywhere in string_array, the default string missing value ("missing") will be returned.
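
Because a failed search returns a missing value rather than an empty array, scripts usually need to test the result before using it. The following is a minimal sketch of one way to do that with ismissing; the array contents and the pattern are made up for illustration.

  strings = (/"cnLineColor","mpFillColor","txString"/)

  ; None of the strings starts with "zz", so no match is expected
  matched = str_match_regex(strings,"^zz")

  ; Test for the missing value before using the result
  if (all(ismissing(matched))) then
    print("no strings matched")
  else
    print(matched)
  end if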

Note that str_match_regex is case SENSITIVE. Use str_match_ic_regex if you need case insensitivity.

A full description of the syntax and capabilities of regular expressions is beyond the scope of this document. See the Unix/POSIX man page regex(7) or similar documentation for a complete explanation, noting that NCL's implementation uses the "modern" form of regular expressions (the form POSIX calls "extended"). In practice, only a very small subset of the full functionality is needed for the purposes of this function.

For those not familiar with the topic, one basic point: in a directory listing, the asterisk ('*') is a wildcard standing for any number of arbitrary characters, but in a regular expression the equivalent operator is the two-character sequence '.*'. On its own, '*' in a regular expression means "zero or more of the preceding character".
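
To make the difference concrete, the sketch below applies a regex-style pattern and a pattern written out of shell-wildcard habit to the same small array. The array contents and variable names are hypothetical and chosen only for illustration.

  ; Hypothetical names, for illustration only
  names = (/"cnLineLabelFont","cnFont","tiMainFont"/)

  ; Regex form: "cn" at the start, any characters, then "Font" at the end
  regex_style = str_match_regex(names,"^cn.*Font$")

  ; Wildcard habit: here '*' only repeats the preceding "n", so the
  ; pattern means "c", zero or more "n", then "Font"
  glob_style  = str_match_regex(names,"^cn*Font$")

  print(regex_style)
  print(glob_style)

Under POSIX regular expression rules, the first call should return "cnLineLabelFont" and "cnFont", while the second should return only "cnFont".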

Examples

Since the "regex" based string matching functions are similar in how they work, the examples below use a mix of the various regex functions to show how they all work.

Example 1

The following set of examples uses the same "strings" array to do different kinds of searches for strings containing the string "line" and/or "Line".

Example 1a

Get the strings and the indexes of all the strings that contain "line" (case sensitive, so "Line" is not matched):

  strings = (/"cnLineColor","mpFillColor","xyLineThicknessF","txString",\
              "polyline","polyline_ndc","123_line","line_123"/)

  strs_matched     = str_match_regex(strings,"line")
  strs_matched_ind = str_match_ind_regex(strings,"line")

  print(strings(strs_matched_ind))
  print(strs_matched)

Both print statements should output the same set of strings:

  (0)     polyline
  (1)     polyline_ndc
  (2)     123_line
  (3)     line_123

Example 1b

Using the same "strings" array, use the case-insensitive functions to find all strings that contain either "Line" or "line":

  strings = (/"cnLineColor","mpFillColor","xyLineThicknessF","txString",\
              "polyline","polyline_ndc","123_line","line_123"/)

  strs_matched_ind_ic = str_match_ind_ic_regex(strings,"line")
  strs_matched_ic     = str_match_ic_regex(strings,"line")

  print(strings(strs_matched_ind_ic))
  print(strs_matched_ic)

Both print statements should output the same set of strings:

  (0)     cnLineColor
  (1)     xyLineThicknessF
  (2)     polyline
  (3)     polyline_ndc
  (4)     123_line
  (5)     line_123
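
A bracket expression offers a narrower alternative to the case-insensitive functions when only one character may vary in case. The sketch below is an illustration of that idea using the case-sensitive str_match_regex; the variable name is arbitrary.

  strings = (/"cnLineColor","mpFillColor","xyLineThicknessF","txString",\
              "polyline","polyline_ndc","123_line","line_123"/)

  ; "[Ll]" matches either "L" or "l"; the remaining letters stay case sensitive
  strs_matched = str_match_regex(strings,"[Ll]ine")

  print(strs_matched)

For this array the result should be the same six strings as in the case-insensitive search. Unlike the case-insensitive functions, this pattern would not match a string such as "LINE".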

Example 2

Using the same strings array as the previous example, use special regex expressions to further narrow the search for particular strings.

Example 2a

Here we use the "^" character with "[a-z][a-z]" to match only the strings that start with exactly two letters followed by the word "line" or "Line":

  strings = (/"cnLineColor","mpFillColor","xyLineThicknessF","txString", \
              "polyline","polyline_ndc","123_line","line_123"/)

  strs_matched_ind_ic = str_match_ind_ic_regex(strings,"^[a-z][a-z]line")
  strs_matched_ic     = str_match_ic_regex(strings,"^[a-z][a-z]line")

  print(strings(strs_matched_ind_ic))
  print(strs_matched_ic)

Both print statements should output the same set of strings:

  (0)     cnLineColor
  (1)     xyLineThicknessF

Example 2b

This is similar to the previous example, but now match strings that end in "line" using the "$" character:

  strings = (/"cnLineColor","mpFillColor","xyLineThicknessF","txString", \
              "polyline","polyline_ndc","123_line","line_123"/)

  strs_matched_ind = str_match_ind_regex(strings,"line$")
  strs_matched     = str_match_regex(strings,"line$")

  print(strings(strs_matched_ind))
  print(strs_matched)

Both print statements should output the same set of strings:

  (0)     polyline
  (1)     123_line

Example 3

Assume you have a mix of date strings with the format "YYYY-MM" or "YYYY-MM-DD" and you want to return only the ones with "YYYY-MM-DD".

  dates = (/"1965-01","1965-01-15","1966-01","1966-01-15","1967-01","1967-01-15"/)

  ; The pattern requires the trailing "-DD", so "YYYY-MM" strings are skipped
  yyyymmdd     = str_match_regex(dates,"[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]")
  yyyymmdd_ind = str_match_ind_regex(dates,"[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]")

  print(yyyymmdd)
  print(dates(yyyymmdd_ind))

The output from both print statements should be the same:

  (0)     1965-01-15
  (1)     1966-01-15
  (2)     1967-01-15
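
The reverse selection, keeping only the "YYYY-MM" strings, needs anchors, because an unanchored "YYYY-MM" pattern also matches the first seven characters of every "YYYY-MM-DD" string. The following sketch illustrates this; the variable name is arbitrary.

  dates = (/"1965-01","1965-01-15","1966-01","1966-01-15","1967-01","1967-01-15"/)

  ; "^" and "$" force the whole string to have the form YYYY-MM
  yyyymm_only = str_match_regex(dates,"^[0-9][0-9][0-9][0-9]-[0-9][0-9]$")

  print(yyyymm_only)

This should print only "1965-01", "1966-01", and "1967-01".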

Example 4

Get all contour resources ending with "Font" from the list of contour resources:

 class_out = NhlGetClassResources("contourPlotClass", "")

 cn_strs = str_match_regex(class_out, "Font$")

 print(cn_strs)

Output:

  (0)     cnConstFLabelFont
  (1)     cnHighLabelFont
  (2)     cnInfoLabelFont
  (3)     cnLineLabelFont
  (4)     cnLowLabelFont
  (5)     lbLabelFont
  (6)     lbTitleFont
  (7)     lgLabelFont
  (8)     lgLineLabelFont
  (9)     lgTitleFont
  (10)    tiMainFont
  (11)    tiXAxisFont
  (12)    tiYAxisFont
  (13)    tmXBLabelFont
  (14)    tmXTLabelFont
  (15)    tmYLLabelFont
  (16)    tmYRLabelFont

Example 5

To further narrow the list of resources, get all contour resources starting with "cn" and ending with "Font":

 class_out = NhlGetClassResources("contourPlotClass", "")

 cn_strs = str_match_regex(class_out, "^cn.*Font$")

 print(cn_strs)

Output:

  (0)     cnConstFLabelFont
  (1)     cnHighLabelFont
  (2)     cnInfoLabelFont
  (3)     cnLineLabelFont
  (4)     cnLowLabelFont

Note: the NhlGetClassResources function also accepts regular expressions, so str_match_regex is not strictly needed here:

 cn_strs = NhlGetClassResources("contourPlotClass", "^cn.*Font$")
