Extremely Serious

Category: RegEx

Regex Capture Groups with Java

The following Java code extracts the group, artifact and version using regex capture groups:

import java.util.regex.Pattern;

public class Main {

    public static void main(String ... args) {
        //Text to extract the group, artifact and version
        var text = "org.junit.jupiter:junit-jupiter-api:5.7.0";

        //Regex capture groups for Group:Artifact:Version
        var pattern = "(.*):(.*):(.*)"; 

        var compiledPattern = Pattern.compile(pattern);
        var matcher = compiledPattern.matcher(text);
        if (matcher.find()) {
            System.out.println("Whole text: " + matcher.group(0));
            System.out.println("Group: " + matcher.group(1));
            System.out.println("Artifact: " + matcher.group(2));
            System.out.println("Version: " + matcher.group(3));
        } else {
            System.out.println("NO MATCH");
        }
    }
}

Output

Whole text: org.junit.jupiter:junit-jupiter-api:5.7.0
Group: org.junit.jupiter
Artifact: junit-jupiter-api
Version: 5.7.0
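
The greedy (.*) groups work for a well-formed coordinate, but they also accept text with extra colons. The following is a minimal sketch of a stricter variant, assuming every part must be non-empty and free of colons. The class name GavParser and the group names are illustrative and not part of the original code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GavParser {

    // Each part is one or more characters that are not a colon.
    private static final Pattern GAV =
            Pattern.compile("(?<group>[^:]+):(?<artifact>[^:]+):(?<version>[^:]+)");

    // Returns {group, artifact, version}, or null when the text is not a G:A:V coordinate.
    static String[] parse(String text) {
        Matcher matcher = GAV.matcher(text);
        // matches() requires the whole text to fit the pattern, unlike find().
        if (!matcher.matches()) {
            return null;
        }
        return new String[] {
                matcher.group("group"), matcher.group("artifact"), matcher.group("version")
        };
    }

    public static void main(String... args) {
        String[] parts = parse("org.junit.jupiter:junit-jupiter-api:5.7.0");
        System.out.println(String.join(" | ", parts));
    }
}
```

With matches(), a string such as a:b:c:d is rejected instead of being split at the last two colons.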

Retrieving the Versions from maven-metadata.xml

Groovy Snippet

List<String> getMavenVersions(String metadataXmlURL) {
    def strVersions = new ArrayList<String>()
    def mvnData = new URL(metadataXmlURL)
    def mvnCN = mvnData.openConnection()
    mvnCN.requestMethod = 'GET'

    if (mvnCN.responseCode==200) {
        def rawResponse = mvnCN.inputStream.text
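        //Non-greedy capture so that several <version> elements on one line are matched separately
        def versionMatcher = rawResponse =~ '<version>(.*?)</version>'
        //Iterating the matcher visits every match; each item is [whole match, group 1]
        for (nVersion in versionMatcher) {
            strVersions.add(nVersion[1])
        }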
    }

    strVersions.sort {v1, v2 ->
        v2.compareTo(v1)
    }

    return strVersions
}

Example Usage

def metadataAddress = 'https://repo.maven.apache.org/maven2/xyz/ronella/casual/trivial-chunk/maven-metadata.xml'
def versions = getMavenVersions(metadataAddress)
println versions
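
The snippet sorts the versions as plain strings, so 1.9.0 ranks above 1.10.0. The following Java sketch shows the same regex extraction with a segment-wise numeric comparison. It works on XML text that has already been downloaded, and the comparator is a simplification that assumes dot-separated versions; it is not Maven's full ordering rules. The names MavenVersions, extract, compareVersions and newestFirst are illustrative.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MavenVersions {

    private static final Pattern VERSION = Pattern.compile("<version>(.*?)</version>");

    // Collects the text of every <version> element found in the metadata XML.
    static List<String> extract(String metadataXml) {
        List<String> versions = new ArrayList<>();
        Matcher matcher = VERSION.matcher(metadataXml);
        while (matcher.find()) {
            versions.add(matcher.group(1));
        }
        return versions;
    }

    // Compares segment by segment; purely numeric segments are compared as numbers.
    static int compareVersions(String v1, String v2) {
        String[] a = v1.split("\\.");
        String[] b = v2.split("\\.");
        for (int i = 0; i < Math.max(a.length, b.length); i++) {
            String x = i < a.length ? a[i] : "0";
            String y = i < b.length ? b[i] : "0";
            boolean numeric = x.matches("\\d+") && y.matches("\\d+");
            int result = numeric
                    ? new BigInteger(x).compareTo(new BigInteger(y))
                    : x.compareTo(y);
            if (result != 0) {
                return result;
            }
        }
        return 0;
    }

    // Returns the extracted versions with the highest version first.
    static List<String> newestFirst(String metadataXml) {
        List<String> versions = extract(metadataXml);
        Comparator<String> byVersion = MavenVersions::compareVersions;
        versions.sort(byVersion.reversed());
        return versions;
    }

    public static void main(String... args) {
        String xml = "<versions><version>1.9.0</version>"
                + "<version>1.10.0</version><version>1.2.0</version></versions>";
        System.out.println(newestFirst(xml));
    }
}
```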

Removing the Timestamp from the downloaded SNAPSHOT

Use case

The downloaded snapshot has a timestamp in its file name, like the following:

artifact-1.0.0-20211012.041152-1.jar

But the tooling expects a fixed name like the following:

artifact-1.0.0-SNAPSHOT.jar

PowerShell Script

#The target artifact
$ArtifactId = "artifact"

#The target SNAPSHOT version
$Version = "1.0.0-SNAPSHOT"

if ($Version -match "^(.*)-SNAPSHOT$") 
{
    $Prefix = "{0}-{1}" -f $ArtifactId,$Matches.1
    #Escape the prefix so that the dots in the version are matched literally
    $Pattern = '^({0}).*(\.jar)$' -f [regex]::Escape($Prefix)

    Get-ChildItem ('.') | ForEach-Object {
        If ($_.Name -match $Pattern) {
            $NewName = "{0}-SNAPSHOT{1}" -f $Matches.1, $Matches.2
            Rename-Item $_ -NewName $NewName
            $Message = "Renaming from {0} to {1}" -f $_.Name, $NewName
            echo $Message
        }
    }
}
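
The renaming rule of the script can also be expressed as a pure function that is easy to test without touching the file system. The following Java sketch assumes the same two patterns as the script. The names SnapshotName and toSnapshotName are hypothetical, and Pattern.quote is used so the dots in the version are matched literally.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SnapshotName {

    // Returns the SNAPSHOT file name for a timestamped jar, or null when nothing matches.
    static String toSnapshotName(String fileName, String artifactId, String snapshotVersion) {
        Matcher version = Pattern.compile("^(.*)-SNAPSHOT$").matcher(snapshotVersion);
        if (!version.matches()) {
            return null;
        }
        String prefix = artifactId + "-" + version.group(1);
        // Group 1 is the literal prefix, group 2 is the extension; the timestamp in between is dropped.
        Pattern jar = Pattern.compile("^(" + Pattern.quote(prefix) + ").*(\\.jar)$");
        Matcher matcher = jar.matcher(fileName);
        if (!matcher.matches()) {
            return null;
        }
        return matcher.group(1) + "-SNAPSHOT" + matcher.group(2);
    }

    public static void main(String... args) {
        System.out.println(toSnapshotName(
                "artifact-1.0.0-20211012.041152-1.jar", "artifact", "1.0.0-SNAPSHOT"));
    }
}
```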

RegEx Negative Look Behind with Sublime

A negative look behind is another useful constraint that we can add to a regular expression. It matches the right side if and only if the look behind doesn't match immediately before it. The look behind doesn't consume any characters and has the following syntax:

(?<!<look behind>)<right side>

Example:

<?xml version="1.0"?>
<fruits>
 <a>apple</a>
 <b></b>
 <c>cashew</c>
 <d></d>
</fruits>

From the XML above, find all the empty elements and add an attribute named empty that is set to true, matching only the opening tag.

Find: (?<!^<)<(?<tag>\w*[^>])>(?=<)
Replace: <$1 empty="true">
Comment: The negative <look behind> is the (?<!^<) part. The <right side> is the rest of the expression, <(?<tag>\w*[^>])>(?=<).
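
The same find and replace pair can be tried outside the editor. The following is a minimal Java sketch that assumes the MULTILINE flag mirrors how the editor treats ^. The class name NegativeLookBehindDemo and the method markEmpty are illustrative.

```java
import java.util.regex.Pattern;

class NegativeLookBehindDemo {

    // MULTILINE lets ^ match at the start of every line.
    private static final Pattern EMPTY_OPENING_TAG =
            Pattern.compile("(?<!^<)<(?<tag>\\w*[^>])>(?=<)", Pattern.MULTILINE);

    // Adds empty="true" to every opening tag that the pattern accepts.
    static String markEmpty(String xml) {
        return EMPTY_OPENING_TAG.matcher(xml).replaceAll("<$1 empty=\"true\">");
    }

    public static void main(String... args) {
        String xml = String.join("\n",
                "<?xml version=\"1.0\"?>",
                "<fruits>",
                " <a>apple</a>",
                " <b></b>",
                " <c>cashew</c>",
                " <d></d>",
                "</fruits>");
        // Only the <b> and <d> opening tags gain the attribute.
        System.out.println(markEmpty(xml));
    }
}
```

With this sample the look behind never rejects a candidate, because no tag directly follows a < at the start of a line; the trailing (?=<) is what singles out the empty elements.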

RegEx Positive Look Behind with Sublime

A positive look behind is another useful constraint that we can add to a regular expression. It matches the right side if and only if the look behind matches immediately before it. The look behind doesn't consume any characters and has the following syntax:

(?<=<look behind>)<right side>

Example:

<?xml version="1.0"?>
<fruits>
 <a>apple</a>
 <b></b>
 <c>cashew</c>
 <d></d>
</fruits>

From the XML above, find all the empty elements and add an attribute named empty that is set to true, matching only the opening tag.

Find Replace Comment
Find: (?<=\n\s)<(?<tag>\w*[^>])>(?=<)
Replace: <$1 empty="true">
Comment: The positive <look behind> is the (?<=\n\s) part. The <right side> is the rest of the expression, <(?<tag>\w*[^>])>(?=<).
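
The following is a minimal Java sketch of the same find and replace pair, assuming the text uses \n line endings. The class name PositiveLookBehindDemo and the method markEmpty are illustrative.

```java
import java.util.regex.Pattern;

class PositiveLookBehindDemo {

    // The opening tag must be preceded by a newline and one whitespace character.
    private static final Pattern EMPTY_OPENING_TAG =
            Pattern.compile("(?<=\\n\\s)<(?<tag>\\w*[^>])>(?=<)");

    // Adds empty="true" to every opening tag that the pattern accepts.
    static String markEmpty(String xml) {
        return EMPTY_OPENING_TAG.matcher(xml).replaceAll("<$1 empty=\"true\">");
    }

    public static void main(String... args) {
        String xml = String.join("\n",
                "<?xml version=\"1.0\"?>",
                "<fruits>",
                " <a>apple</a>",
                " <b></b>",
                " <c>cashew</c>",
                " <d></d>",
                "</fruits>");
        // Only the indented <b> and <d> opening tags gain the attribute.
        System.out.println(markEmpty(xml));
    }
}
```

An empty element that is not indented on its own line, such as <b></b> at the very start of the text, is left unchanged because the look behind fails there.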

RegEx Negative Look Ahead with Sublime

A negative look ahead is another useful constraint that we can add to a regular expression. It matches the left side if and only if the look ahead doesn't match immediately after it. The look ahead doesn't consume any characters and has the following syntax:

<left side>(?!<look ahead>)

Example:

<?xml version="1.0"?>
<fruits>
 <a>apple</a>
 <b></b>
 <c>cashew</c>
 <d></d>
</fruits>

From the XML above, find all the empty elements and add an attribute named empty that is set to true, matching only the opening tag.

Find Replace Comment
Find: <(?<tag>\w*[^>])>(?!(\n|\w))
Replace: <$1 empty="true">
Comment: The <left side> is the <(?<tag>\w*[^>])> part. The negative <look ahead> is the (?!(\n|\w)) part.
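
The following is a minimal Java sketch of the same find and replace pair, assuming \n line endings. The class name NegativeLookAheadDemo and the method markEmpty are illustrative.

```java
import java.util.regex.Pattern;

class NegativeLookAheadDemo {

    // The opening tag must not be followed by a newline or a word character.
    private static final Pattern EMPTY_OPENING_TAG =
            Pattern.compile("<(?<tag>\\w*[^>])>(?!(\\n|\\w))");

    // Adds empty="true" to every opening tag that the pattern accepts.
    static String markEmpty(String xml) {
        return EMPTY_OPENING_TAG.matcher(xml).replaceAll("<$1 empty=\"true\">");
    }

    public static void main(String... args) {
        String xml = String.join("\n",
                "<?xml version=\"1.0\"?>",
                "<fruits>",
                " <a>apple</a>",
                " <b></b>",
                " <c>cashew</c>",
                " <d></d>",
                "</fruits>");
        // <fruits> is followed by a newline and <a>, <c> by a word character, so they are skipped.
        System.out.println(markEmpty(xml));
    }
}
```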

RegEx Positive Look Ahead with Sublime

A positive look ahead is another useful constraint that we can add to a regular expression. It matches the left side if and only if the look ahead matches immediately after it. The look ahead doesn't consume any characters and has the following syntax:

<left side>(?=<look ahead>)

Example:

<?xml version="1.0"?>
<fruits>
 <a>apple</a>
 <b></b>
 <c>cashew</c>
 <d></d>
</fruits>

From the XML above, find all the empty elements and add an attribute named empty that is set to true, matching only the opening tag.

Find Replace Comment
Find: <(?<tag>\w*[^>])>(?=</\k<tag>>)
Replace: <$1 empty="true">
Comment: The <left side> is the <(?<tag>\w*[^>])> part. The positive <look ahead> is the (?=</\k<tag>>) part.
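
The following is a minimal Java sketch of the same find and replace pair. The class name PositiveLookAheadDemo and the method markEmpty are illustrative.

```java
import java.util.regex.Pattern;

class PositiveLookAheadDemo {

    // The opening tag must be followed directly by its own closing tag.
    private static final Pattern EMPTY_OPENING_TAG =
            Pattern.compile("<(?<tag>\\w*[^>])>(?=</\\k<tag>>)");

    // Adds empty="true" to every opening tag that the pattern accepts.
    static String markEmpty(String xml) {
        return EMPTY_OPENING_TAG.matcher(xml).replaceAll("<$1 empty=\"true\">");
    }

    public static void main(String... args) {
        String xml = String.join("\n",
                "<?xml version=\"1.0\"?>",
                "<fruits>",
                " <a>apple</a>",
                " <b></b>",
                " <c>cashew</c>",
                " <d></d>",
                "</fruits>");
        // The closing tags are only looked at, so they are not part of the replaced text.
        System.out.println(markEmpty(xml));
    }
}
```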

RegEx Named Groups with Sublime

Parentheses in a regular expression (RegEx) can be used to group parts of the expression, and a group can be named. To name a group, follow its opening parenthesis with:

 ?<name>

Within the same expression you can backreference the group using the following:

\k<name>

In the replace field, groups are numbered by position, starting from 1 and counting from the left of the expression. Reference a captured match by its number preceded by a dollar sign (i.e. $).

Note: if the group number has more than one digit, use the ${<nn>} notation (e.g. ${10}).

Example

<?xml version="1.0"?>
<fruits>
 <a>apple</a>
 <b></b>
 <c>cashew</c>
 <d></d>
</fruits>

From the XML above, find all the empty elements and add an attribute named empty that is set to true.

Find Replace Comment
Find: <(?<tag>\w*[^>])></\k<tag>>
Replace: <$1 empty="true"></$1>
Comment: The named group (?<tag>\w*[^>]) is assigned to $1. The \k<tag> is the backreference.
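
The following is a minimal Java sketch of the same expression. Besides the numbered $1 form described above, Java's replacement string also accepts ${tag} to reference the group by name, which is what this sketch uses. The class name NamedGroupDemo and the method markEmpty are illustrative.

```java
import java.util.regex.Pattern;

class NamedGroupDemo {

    // \k<tag> requires the closing tag to repeat the text captured by the tag group.
    private static final Pattern EMPTY_ELEMENT =
            Pattern.compile("<(?<tag>\\w*[^>])></\\k<tag>>");

    // Rewrites every empty element with an empty="true" attribute on its opening tag.
    static String markEmpty(String xml) {
        return EMPTY_ELEMENT.matcher(xml).replaceAll("<${tag} empty=\"true\"></${tag}>");
    }

    public static void main(String... args) {
        String xml = String.join("\n",
                "<?xml version=\"1.0\"?>",
                "<fruits>",
                " <a>apple</a>",
                " <b></b>",
                " <c>cashew</c>",
                " <d></d>",
                "</fruits>");
        System.out.println(markEmpty(xml));
    }
}
```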

RegEx Capturing Groups with Sublime

Parentheses in a regular expression (RegEx) can be used to group parts of the expression.

Within the same expression you can backreference the group using the following:

\<GROUP_POSITION>

<GROUP_POSITION> is the ordinal position of the group, counted from left to right starting from 1.

In the replace field, use the same <GROUP_POSITION> but precede it with a dollar sign (i.e. $) instead of a backslash (i.e. \).

Note: if the group number has more than one digit, use the ${<nn>} notation (e.g. ${10}).

Example

<?xml version="1.0"?>
<fruits>
 <a>apple</a>
 <b></b>
 <c>cashew</c>
 <d></d>
</fruits>

From the XML above, find all the empty elements and add an attribute named empty that is set to true.

Find Replace Comment
Find: <(\w*[^>])></\1>
Replace: <$1 empty="true"></$1>
Comment: The group (\w*[^>]) is assigned to $1. The \1 is the backreference.
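
The following is a minimal Java sketch of the same expression, where \1 and $1 behave as described above. The class name NumberedGroupDemo and the method markEmpty are illustrative.

```java
import java.util.regex.Pattern;

class NumberedGroupDemo {

    // \1 requires the closing tag to repeat the text captured by the first group.
    private static final Pattern EMPTY_ELEMENT = Pattern.compile("<(\\w*[^>])></\\1>");

    // Rewrites every empty element with an empty="true" attribute on its opening tag.
    static String markEmpty(String xml) {
        return EMPTY_ELEMENT.matcher(xml).replaceAll("<$1 empty=\"true\"></$1>");
    }

    public static void main(String... args) {
        String xml = String.join("\n",
                "<?xml version=\"1.0\"?>",
                "<fruits>",
                " <a>apple</a>",
                " <b></b>",
                " <c>cashew</c>",
                " <d></d>",
                "</fruits>");
        System.out.println(markEmpty(xml));
    }
}
```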