General regx command line format

To get help:
    regx help <category>

To find matching (or not matching) lines in a file or files:
   regx [/Pattern] pattern
        [/File filespec] [/Recurse]
        {output options         -- see below}
        {display detail options -- see below}
        {Regex options          -- see below}

To find matching (or not matching) lines in a stream:
   <command> | regx [/Pattern] pattern
        {output options         -- see below}
        {display detail options -- see below}
        {Regex options          -- see below}

To find matching (or not matching) lines on a web page or pages:
   regx [/Pattern] pattern
        [/Url weburl] [/Fusk]
        {output options -- see below}
        {display detail options -- see below}
        {Regex options -- see below}

To perform replacements, add
        /Replace substitution


HELP

You can get general help on using regx with any common help idiom:
No parameters:
    regx

Windows standard:
    regx /?
    regx help

Common Unix tools:
    regx --h
    regx --help
    regx -?

You can get help on specific topics by adding a help category:

regx help <category>

Help categories:

Usage       - the same as not providing a category
Regex       - A quick reminder/reference on .Net regex syntax
Matching    - Detailed reminder on .Net matching syntax
Replacing   - .Net replacement/substitution syntax
Parameters  - Detailed description of parameters to regx
Format      - Details on how to use the /Format option
Version     - Version and credits

You can abbreviate the category name, as long as it is unambiguous:

regx help ver


PARAMETER SYNTAX

You may prefix parameters with a slash (Windows standard) or a dash (Unix standard):
    /f <filename>
    -f <filename>

All parameters that take values (/Pattern, /Replace, /Output, /Details, /File) can have their values specified with "=", ":", or whitespace:
    /f <filename>
    /f=<filename>
    /f:<filename>

Switches (parameters that can be true or false) can be specified as:
    /i                              - No value means to set to true
    /i true,  /i=true,  /i:true
    /i false, /i=false, /i:false
    /i+                             - Sets to true
    /i-                             - Sets to false

Parameter names are never case-sensitive. These are equivalent:
    /ipw
    /IPW
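
The parameter forms above can be sketched in a few lines of Python. This is only an illustration of the rules as documented here, not regx's actual parser; the function name split_param is hypothetical.

```python
def split_param(token):
    """Split one token such as '/f=data.txt' into (name, value).

    Hypothetical helper that mirrors the documented rules only.
    """
    if not token or token[0] not in "/-":
        raise ValueError("not a parameter: " + token)
    body = token[1:]
    # "=" and ":" both separate a name from its value.
    for i, ch in enumerate(body):
        if ch in "=:":
            return body[:i].lower(), body[i + 1:]
    # A trailing + or - is the switch shorthand for true or false.
    if body.endswith("+"):
        return body[:-1].lower(), "true"
    if body.endswith("-"):
        return body[:-1].lower(), "false"
    # No value attached: either a bare switch, or the value is the next token.
    return body.lower(), None
```

Names are lowercased so that /ipw and /IPW compare equal. Only the first "=" or ":" is treated as the separator, so a value such as q:\data\file.txt keeps its drive colon.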


THE PARAMETERS


Pattern

pattern
/Pattern pattern
/p pattern

The regex to match against. You can omit the "/pattern" indicator. The first unadorned token on the command line is assumed to be the pattern.

If the pattern contains any whitespace or any characters special to the shell, you should put it in quotes and/or escape the characters.

    regx abc -f inputfile
    regx "ab c" -f inputfile    - Quoted for the space
    regx "&"                    - Quoted for the ampersand
    regx "^"                    - Caret is the cmd escape character and
                                  requires quoting
    regx ^^                     - Escaping the caret works as well as quoting


File

/File <filename>
/File <filespec>
/f <filename>
/f <filespec>

File or files to match the regex against. Each matching file is loaded and matched, one line at a time, against the regex pattern.

/File may be shortened to /f.

    /f file.txt         - Matches against the specified file
    /f *.txt            - Matches all .txt files in the current directory
    /f ..\data\*.csv    - Matches all .csv files in the relative directory
    /f q:\data\file.txt - Matches against the specified file

You may not provide both /File and /Url


Recurse

Switch (boolean parameter: true or false)

/Recurse <value>
/r <value>

Values:

  • False (Default)
    regx searches only the input directory for files that match the filename or filespec.
  • True
    regx searches the input directory and all directories under it for files that match the filename or filespec.

If we have these directories and files:
    c:\data\f1.txt
    c:\data\f2.txt
    c:\data\q1\f1.txt
    c:\data\q1\f3.txt
   
    /f c:\data\*.txt        Will search c:\data\f1.txt and c:\data\f2.txt
    /f c:\data\*.txt /r     Will search all four files
    /f c:\data\f1.txt       Will search only c:\data\f1.txt
    /f c:\data\f1.txt /r    Will search c:\data\f1.txt and c:\data\q1\f1.txt
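
The difference between the two settings can be sketched in Python. This is a minimal illustration of the behavior described above, not regx's implementation; the function name find_files and its parameters are hypothetical.

```python
import fnmatch
import os


def find_files(directory, filespec, recurse=False):
    """Return paths under directory whose file name matches filespec."""
    matches = []
    for root, dirs, files in os.walk(directory):
        dirs.sort()  # visit subdirectories in a stable order
        for name in sorted(files):
            if fnmatch.fnmatch(name, filespec):
                matches.append(os.path.join(root, name))
        if not recurse:
            break  # stop after the input directory itself
    return matches
```

With the four files listed above, find_files(r"c:\data", "*.txt") would return two paths and find_files(r"c:\data", "f1.txt", recurse=True) would return both copies of f1.txt.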


Url

/Url <url>

The Url of a file. If no scheme is specified, regx tries to use http: to access the resource.


Fusk

Switch (boolean parameter: true or false)

/Fusk

Causes the /Url parameter to be treated as a fusk pattern. Fusk patterns are regular urls with expansion sections. An expansion section is enclosed in square brackets and consists of values or ranges, separated by commas. A range is two numbers or two single letters with a hyphen in between. If the range is numeric and the first number begins with a zero, the numbers are padded with zeroes to the length of the starting number.

Examples:

    site.com/page[1-141].html
    Searches 141 pages:
        site.com/page1.html
        site.com/page2.html    
        ...   
        site.com/page141.html

    site.com/page[001-141].html
    Searches 141 pages:
        site.com/page001.html
        site.com/page002.html  
        ...    
        site.com/page141.html

    site.com/page[a-m].html
        site.com/pagea.html    
        ...   
        site.com/pagem.html

    site.com/page[N-Z].html
        site.com/pageN.html    
        ...   
        site.com/pageZ.html

    site.com/page[one,two,three].html
    Three pages:
        site.com/pageone.html
        site.com/pagetwo.html
        site.com/pagethree.html

    site.com/page[01-99,200-250,a-n,_special].html
    Each part of the expansion section is expanded in turn, so this searches:
        site.com/page01.html
        ...
        site.com/page99.html
                - and -
        site.com/page200.html
        ...
        site.com/page250.html
                - and -
        site.com/pagea.html
        ...
        site.com/pagen.html
                - and -
        site.com/page_special.html

    site.com/data/run[,_data]_[01-30].xml
    The empty section in the first expansion block ("[,_data]") expands to
    nothing. The two expansion blocks are expanded independently and the
    cross-product is searched. This searches:
        site.com/data/run_01.xml
        ...
        site.com/data/run_30.xml
                - and -
        site.com/data/run_data_01.xml 
        ... 
        site.com/data/run_data_30.xml
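
The expansion rules can be sketched in Python as follows. This is an illustration written from the description above, not regx's own code; the names expand_item and expand_fusk are hypothetical, and malformed patterns (such as an unclosed bracket) are not handled.

```python
import itertools
import re


def expand_item(item):
    """Expand one comma-separated item: a literal value or a range."""
    m = re.fullmatch(r"(\d+)-(\d+)", item)
    if m:
        start, end = m.group(1), m.group(2)
        # Pad with zeroes only when the first number begins with a zero.
        width = len(start) if start.startswith("0") else 0
        return [str(n).zfill(width) for n in range(int(start), int(end) + 1)]
    m = re.fullmatch(r"([A-Za-z])-([A-Za-z])", item)
    if m:
        return [chr(c) for c in range(ord(m.group(1)), ord(m.group(2)) + 1)]
    return [item]  # a plain value, possibly empty


def expand_fusk(url):
    """Return every URL produced by the expansion sections in url."""
    # re.split with one capture group alternates literal text (even
    # indexes) with the contents of each [...] section (odd indexes).
    parts = re.split(r"\[([^\]]*)\]", url)
    choices = []
    for i, part in enumerate(parts):
        if i % 2:
            values = []
            for item in part.split(","):
                values.extend(expand_item(item))
            choices.append(values)
        else:
            choices.append([part])
    # Multiple sections combine as a cross-product.
    return ["".join(combo) for combo in itertools.product(*choices)]
```

For example, expand_fusk("site.com/data/run[,_data]_[01-30].xml") yields 60 urls, starting with site.com/data/run_01.xml and ending with site.com/data/run_data_30.xml.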

You may not provide both /File and /Url


NoWarnings

Switch (boolean parameter: true or false)

/NoWarnings false       (default)
/NoWarnings

NoWarnings cannot be abbreviated, although it is case-insensitive.

If true, regx will not output warnings about odd parameter combinations or unrecognized parameters.

Without NoWarnings, this command line will issue a warning:

> regx /Pattern "a" /Replace "b" /Output Unmatched
Warning: A replacement pattern was supplied, but "/Output=Unmatched" or
"/v" is in effect. No replacements will occur.

Specifying NoWarnings will suppress these messages.


ErrorsToStdOut

Switch (boolean parameter: true or false)

/ErrorsToStdOut false       (default)
/ErrorsToStdOut

ErrorsToStdOut cannot be abbreviated, although it is case-insensitive.

If true, errors and warnings will be sent to stdout. By default they are sent to stderr. This can help in some logging situations.


Debug

Switch (boolean parameter: true or false)

/Debug false        (default)
/Debug

Debug cannot be abbreviated, although it is case-insensitive.

If true, regx will output information about its options, what input it expects to search, and the regex object it creates and then exit. No input will be processed. This can be useful to test your filespecs and regex options without waiting on regx to process input.


Output

Selects which lines to output: those that match the pattern, those that do
not, or all input lines.

Values:
    /o matched      - Output lines from the input that match the regex
    /o unmatched    - Output lines from the input that do not match the regex
    /o all          - Output all lines of the input
                      This is generally only useful with a substitution
                      pattern (/rp)

Both /Output and its values may be abbreviated to any initial letters. Both
are also case insensitive.
    /o=m
    /o a
    /Out:all
    /out match

** Note that /InvertResults (or its shorter form /i) is an abbreviation for
/Output=Unmatched. See notes below on the interaction between the two.


InvertResults

Switch (boolean parameter: true or false)
Sets the output selection to Unmatched unless overridden by /Out=All.
Apart from the shorthand forms /i and /Invert, /InvertResults cannot be abbreviated, although it is case-insensitive.

/InvertResults false    (default)
/InvertResults          Sets to true
/Invert
/i                      Sets to true

/InvertResults combines with /Output in the following way:

--------------------------------------------------------
Output=         |   /InvertResults+    /InvertResults-
--------------------------------------------------------
(unspecified)   |   Unmatched          Matched
All             |   All                All
Matched         |   Unmatched          Matched
Unmatched       |   Unmatched          Unmatched

Specifying /InvertResults and /Output=All will generate a warning unless /NoWarnings is in effect.
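
The table reduces to a small decision function. The Python below is a sketch of that logic only; the function name effective_output is hypothetical, and the warning for the /InvertResults plus /Output=All combination is left out.

```python
def effective_output(output=None, invert=False):
    """Return the output selection that results from /Output and /InvertResults.

    output is None (unspecified), "matched", "unmatched" or "all".
    """
    if output == "all":
        return "all"  # All always wins, whatever /InvertResults says
    if invert or output == "unmatched":
        return "unmatched"
    return "matched"
```

Note that /InvertResults never turns Unmatched back into Matched: it only selects unmatched lines, it does not toggle the current selection.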


ShowRelativeNames

Switch (boolean parameter: true or false)
Causes file names to be shortened so that they are relative to the /File parameter.

Defaults to true.

Example:
    regx pattern /file c:\test\data\*.txt /r /ShowRelativeNames-
        c:\test\data\data1.txt(1) pattern
        c:\test\data\moredata\data2.txt(1) pattern

    regx pattern /file c:\test\data\*.txt /r /ShowRelativeNames
        data1.txt(1) pattern
        moredata\data2.txt(1) pattern

This option has no effect on /Url searches or searching as a filter.
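
The shortening shown in the example amounts to computing a path relative to the search directory. A minimal Python sketch, assuming the directory part of the /File value is used as the base (the function name relative_name is hypothetical):

```python
import ntpath  # Windows path rules, importable on any platform


def relative_name(path, base_dir):
    """Return path relative to base_dir, using Windows separators."""
    return ntpath.relpath(path, start=base_dir)
```

Calling relative_name(r"c:\test\data\moredata\data2.txt", r"c:\test\data") returns moredata\data2.txt, which matches the second example above.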


Details

Specifies what details to add to output lines from regx.
/Details and its values may be abbreviated and are case insensitive:

/details none
/detail line
/d File

Values:
    /d Smart        (default) Becomes "LineNumber" in filter mode and
                    when parsing a single file. It becomes "FileAndLine"
                    when a filespec matches multiple files
    /d None         Output lines are emitted (possibly with replacements)
                    with no extra adornment. It is especially useful when
                    replacing (/rp)
    /d LineNumber   Output lines have their line number in the input prepended
    /d FileAndLine  Output lines have the full filename of the original file
                    and the line number in the original file prepended

If we have identical source files c:\data\t1.txt and c:\data\t2.txt:
       One Two Three Four Five
       One Three Five Seven
       Two Four Six Eight

  The default, "smart", means "LineNumber" when a single file is processed:
        regx Four -f c:\data\t1.txt /rp 4
        1: One Two Three 4 Five
        3: Two 4 Six Eight
  and "smart" means "FileAndLine" when the filespec matches multiple files:
        regx Four -f c:\data\t?.txt /rp 4
        c:\data\t1.txt(1): One Two Three 4 Five
        c:\data\t1.txt(3): Two 4 Six Eight
        c:\data\t2.txt(1): One Two Three 4 Five
        c:\data\t2.txt(3): Two 4 Six Eight

  Smart always means "LineNumber" in filter mode:
        type c:\data\t1.txt | regx Four /rp 4
        1: One Two Three 4 Five
        3: Two 4 Six Eight

  "None" removes all adornment. It is most useful when replacing.
        regx Four -f c:\data\t1.txt /d n /rp 4
        One Two Three 4 Five
        Two 4 Six Eight
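
How "Smart" resolves, and how each detail level decorates a line, can be sketched in Python. The output formats are copied from the examples above; the function names and the use of a file count of 0 for filter mode are assumptions made for this illustration.

```python
def resolve_details(details, file_count):
    """Resolve "smart" to a concrete detail level.

    file_count is the number of files matched by the filespec,
    or 0 in filter mode (input read from a stream).
    """
    if details != "smart":
        return details
    return "fileandline" if file_count > 1 else "linenumber"


def format_line(details, filename, line_number, text):
    """Decorate one output line according to the resolved detail level."""
    if details == "none":
        return text
    if details == "linenumber":
        return f"{line_number}: {text}"
    return f"{filename}({line_number}): {text}"
```

For instance, format_line(resolve_details("smart", 2), r"c:\data\t1.txt", 3, "Two 4 Six Eight") gives c:\data\t1.txt(3): Two 4 Six Eight.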


Regex options

All of these are switches (boolean options that can be true or false).
All default to false.
Except as specified, these switches cannot be abbreviated, although they are case-insensitive.

/CaseInsensitive
/i
Makes regex matching case-insensitive.

/IgnorePatternWhitespace
/ipw
Eliminates unescaped white space from the pattern and enables comments marked with #.
This can be useful when building regex programmatically and can make your regex more readable.

/CultureInvariant
Specifies that cultural differences in language are ignored.

/ExplicitCaptures
/ExplicitCapturesOnly

Specifies that the only valid captures are explicitly named or numbered groups of the form (?<name>…). This allows unnamed parentheses to act as noncapturing groups without the syntactic clumsiness of the expression (?:…).

/Uncompiled
/u

By default, regx passes the Compiled flag when creating Regex objects. This flag slows down creation of the object, but allows it to run faster over repeated use, such as when processing the same regex on a large input file. If you are running regx many times in a script over small amounts of input, using the /u flag may speed up your script.

/RightToLeft
/rtl

Turns on right-to-left matching. Note that in /rtl mode search patterns are still read left-to-right and lookaround expressions do not change their direction.
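
To show what /IgnorePatternWhitespace makes possible, here is the same idea in Python, whose re.VERBOSE flag is a rough counterpart of the .Net option. It is an analogy only: regx uses the .Net regex engine, and the two engines differ in details.

```python
import re

# Under re.VERBOSE, unescaped whitespace in the pattern is ignored and
# "#" starts a comment, so the pattern can be laid out one piece per line.
DATE = re.compile(r"""
    (\d{4})   # year
    -
    (\d{2})   # month
    -
    (\d{2})   # day
""", re.VERBOSE)

match = DATE.search("Last edited 2012-10-15")
```

Here match.groups() holds the year, month and day as three separate strings. The compact form of the same pattern is (\d{4})-(\d{2})-(\d{2}).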


Syntax

Allows switching to ECMA regex syntax

    /Syntax DotNet      (default)
    /Syntax ECMA        Use ECMAScript-compatible syntax

ECMA syntax is incompatible with /CaseInsensitive and requires /Uncompiled.

You may want to use ECMA syntax when working with regex intended for use in JavaScript.


Exclusions

regx does not currently support the .Net Regex options Multiline or Singleline.

Last edited Oct 15, 2012 at 7:05 AM by SethMorris, version 3
