May 15

More than once I've spent far too long searching for a working IPv6 regular expression. There's a lot of crap out there that doesn't take certain cases into account. Of course, you only find out which one works best by testing them all. I've tried A LOT of them and finally found the right one.

As a little reminder for myself, and perhaps a helping hand for somebody else: I found this page useful and working fine.

The regex itself is:

\s*((([0-9A-Fa-f]{1,4}:){7}(([0-9A-Fa-f]{1,4})|:))|
(([0-9A-Fa-f]{1,4}:){6}(:|((25[0-5]|2[0-4]\d|[01]?\d{1,2})
(\.(25[0-5]|2[0-4]\d|[01]?\d{1,2})){3})|(:[0-9A-Fa-f]{1,4})))|
(([0-9A-Fa-f]{1,4}:){5}((:((25[0-5]|2[0-4]\d|[01]?\d{1,2})
(\.(25[0-5]|2[0-4]\d|[01]?\d{1,2})){3})?)|((:[0-9A-Fa-f]{1,4}){1,2})))|
(([0-9A-Fa-f]{1,4}:){4}(:[0-9A-Fa-f]{1,4}){0,1}((:((25[0-5]|2[0-4]\d|
[01]?\d{1,2})(\.(25[0-5]|2[0-4]\d|[01]?\d{1,2})){3})?)|
((:[0-9A-Fa-f]{1,4}){1,2})))|(([0-9A-Fa-f]{1,4}:){3}(:[0-9A-Fa-f]{1,4})
{0,2}((:((25[0-5]|2[0-4]\d|[01]?\d{1,2})(\.(25[0-5]|2[0-4]\d|
[01]?\d{1,2})){3})?)|((:[0-9A-Fa-f]{1,4}){1,2})))|
(([0-9A-Fa-f]{1,4}:){2}(:[0-9A-Fa-f]{1,4}){0,3}
((:((25[0-5]|2[0-4]\d|[01]?\d{1,2})
(\.(25[0-5]|2[0-4]\d|[01]?\d{1,2})){3})?)|((:[0-9A-Fa-f]{1,4}){1,2})))|
(([0-9A-Fa-f]{1,4}:)(:[0-9A-Fa-f]{1,4}){0,4}((:((25[0-5]|2[0-4]\d|[01]?\d{1,2})
(\.(25[0-5]|2[0-4]\d|[01]?\d{1,2})){3})?)|((:[0-9A-Fa-f]{1,4}){1,2})))|
(:(:[0-9A-Fa-f]{1,4}){0,5}((:((25[0-5]|2[0-4]\d|[01]?\d{1,2})
(\.(25[0-5]|2[0-4]\d|[01]?\d{1,2})){3})?)|((:[0-9A-Fa-f]{1,4}){1,2})))|
(((25[0-5]|2[0-4]\d|[01]?\d{1,2})(\.(25[0-5]|2[0-4]\d|
[01]?\d{1,2})){3})))(%.+)?\s*

In PHP, for example, this becomes:

define('IPV6_REGEX', "/^\s*((([0-9A-Fa-f]{1,4}:){7}
(([0-9A-Fa-f]{1,4})|:))|(([0-9A-Fa-f]{1,4}:){6}
(:|((25[0-5]|2[0-4]\d|[01]?\d{1,2})(\.(25[0-5]|2[0-4]\d|
[01]?\d{1,2})){3})|(:[0-9A-Fa-f]{1,4})))|
(([0-9A-Fa-f]{1,4}:){5}((:((25[0-5]|2[0-4]\d|[01]?\d{1,2})
(\.(25[0-5]|2[0-4]\d|[01]?\d{1,2})){3})?)|
((:[0-9A-Fa-f]{1,4}){1,2})))|(([0-9A-Fa-f]{1,4}:){4}
(:[0-9A-Fa-f]{1,4}){0,1}((:((25[0-5]|2[0-4]\d|[01]?\d{1,2})
(\.(25[0-5]|2[0-4]\d|[01]?\d{1,2})){3})?)|
((:[0-9A-Fa-f]{1,4}){1,2})))|(([0-9A-Fa-f]{1,4}:){3}
(:[0-9A-Fa-f]{1,4}){0,2}((:((25[0-5]|2[0-4]\d|[01]?\d{1,2})
(\.(25[0-5]|2[0-4]\d|[01]?\d{1,2})){3})?)|((:[0-9A-Fa-f]{1,4}){1,2})))|
(([0-9A-Fa-f]{1,4}:){2}(:[0-9A-Fa-f]{1,4}){0,3}((:((25[0-5]|2[0-4]\d|
[01]?\d{1,2})(\.(25[0-5]|2[0-4]\d|[01]?\d{1,2})){3})?)|
((:[0-9A-Fa-f]{1,4}){1,2})))|(([0-9A-Fa-f]{1,4}:)(:[0-9A-Fa-f]{1,4}){0,4}
((:((25[0-5]|2[0-4]\d|[01]?\d{1,2})(\.(25[0-5]|2[0-4]\d|[01]?\d{1,2})){3})?)|
((:[0-9A-Fa-f]{1,4}){1,2})))|(:(:[0-9A-Fa-f]{1,4}){0,5}((:((25[0-5]|2[0-4]\d|
[01]?\d{1,2})(\.(25[0-5]|2[0-4]\d|[01]?\d{1,2})){3})?)|
((:[0-9A-Fa-f]{1,4}){1,2})))|(((25[0-5]|2[0-4]\d|[01]?\d{1,2})
(\.(25[0-5]|2[0-4]\d|[01]?\d{1,2})){3})))(%.+)?\s*$/x");
// The x modifier makes PCRE ignore the line breaks used to wrap the pattern.
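
Since a pattern like this can only be trusted after testing, here is a minimal sketch of a test harness, written in Python rather than PHP purely for illustration. It rebuilds the pattern above from its repeated pieces and compares it with Python's standard `ipaddress` module, which is assumed here to be a reasonable reference. The names `H`, `OCTET`, `IPV4`, `TAIL`, `BRANCHES` and `disagreements` are made up for this example and do not come from the original source.

```python
import ipaddress
import re

# Building blocks of the pattern above (names are illustrative only).
H = r"[0-9A-Fa-f]{1,4}"                      # one hex group
OCTET = r"(25[0-5]|2[0-4]\d|[01]?\d{1,2})"   # one IPv4 segment
IPV4 = rf"({OCTET}(\.{OCTET}){{3}})"         # dotted quad
TAIL = rf"((:{IPV4}?)|((:{H}){{1,2}}))"      # ending shared by most branches

# One entry per alternative, in the same order as in the post.
BRANCHES = [
    rf"(({H}:){{7}}(({H})|:))",
    rf"(({H}:){{6}}(:|{IPV4}|(:{H})))",
    rf"(({H}:){{5}}{TAIL})",
    rf"(({H}:){{4}}(:{H}){{0,1}}{TAIL})",
    rf"(({H}:){{3}}(:{H}){{0,2}}{TAIL})",
    rf"(({H}:){{2}}(:{H}){{0,3}}{TAIL})",
    rf"(({H}:)(:{H}){{0,4}}{TAIL})",
    rf"(:(:{H}){{0,5}}{TAIL})",
    rf"({IPV4})",
]

# re.ASCII keeps \d limited to 0-9, as in PCRE's default mode.
IPV6_RE = re.compile(r"\s*(" + "|".join(BRANCHES) + r")(%.+)?\s*", re.ASCII)


def regex_accepts(address: str) -> bool:
    """True if the whole string matches the pattern from the post."""
    return IPV6_RE.fullmatch(address) is not None


def reference_accepts(address: str) -> bool:
    """True if Python's ipaddress module parses the string as IPv6."""
    try:
        ipaddress.IPv6Address(address)
        return True
    except ValueError:
        return False


def disagreements(samples):
    """Return the samples on which the regex and the reference differ."""
    return [s for s in samples if regex_accepts(s) != reference_accepts(s)]


if __name__ == "__main__":
    samples = ["::1", "2001:db8::1", "1111::5555:", "::5555:"]
    # Prints the two trailing-colon inputs, which only the regex accepts.
    print(disagreements(samples))
```

Comparing against an independent parser instead of a hand-written list of expected results keeps the harness short, at the cost of inheriting whatever that parser considers valid.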

8 Responses to “Working IPv6 regular expression”

  1. Aeron says:

    The http://forums.dartware.com/viewtopic.php?t=452 regex allows the following:
    1111:2222:3333:4444::5555:
    1111:2222:3333::5555:
    1111:2222::5555:
    1111::5555:
    ::5555:

    Which are invalid.
    I’ve emailed the author.

  2. Patrick says:

    Nice work Aeron!

    It’s a shame it doesn’t work in 100% of the cases, but I must say that it’s the best I’ve seen so far. If the author could correct the error, that would be fantastic :)

  3. Aeron says:

    The regex allows leading zeros in the IPv4 parts.

    Some Unix and Mac distros interpret such segments as octal.

    I suggest using: 25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d as an IPv4 segment.
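
To see what the suggested segment pattern changes, here is a small sketch in Python that runs both segment patterns over a few sample values. The constant names and the `accepts` helper are made up for this example; only the two patterns themselves come from the post and the comment above.

```python
import re

# Segment pattern used in the regex from the post.
ORIGINAL_OCTET = re.compile(r"25[0-5]|2[0-4]\d|[01]?\d{1,2}", re.ASCII)
# Stricter segment pattern suggested in the comment above.
STRICT_OCTET = re.compile(r"25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d", re.ASCII)


def accepts(pattern, segment):
    """True if the whole segment matches the given pattern."""
    return pattern.fullmatch(segment) is not None


if __name__ == "__main__":
    for segment in ["0", "86", "086", "007", "255", "256"]:
        print(segment,
              accepts(ORIGINAL_OCTET, segment),
              accepts(STRICT_OCTET, segment))
```

Both patterns accept `0`, `86` and `255` and reject `256`; only the original one accepts the zero-padded `086` and `007`.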

  4. Stephen Ryan says:

    I’m working on an updated version; I’ve replied to Aeron and if my updates work for him, I’ll post it at the original Dartware site and send it to you as well.

  5. Patrick says:

    Cool!
    Thanks very much Stephen!

  6. Jonas says:

    I am using the regexp for IPv6 (copied from Dartware on 13 January) in my Erlang programming.
    But I have a problem with two of the examples. Maybe you can help me.

    Since I am using the Erlang re module rather than Perl, I have removed the /^\s* from the beginning and the (%.+)?\s*$/ from the end.

    I got nomatch when trying these two:
    fe80:0000:0000:0000:0204:61ff:254.157.241.086 // IPv4 dotted quad at the end
    fe80:0:0:0:0204:61ff:254.157.241.86 // drop leading zeroes, IPv4 dotted quad at the end

    I still got a match for the address 1111:2222:3333:4444::5555:
    which is invalid according to post 1.

    Do you have any explanation or suggestion?

    Many thanks for a reply.
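
One general point about stripping `^` and `$` from a pattern: the `/` delimiters are Perl/PHP syntax, but the anchors are part of the pattern itself, and without them a regex engine will usually accept any input that merely contains a match. This does not by itself explain the results reported above. The sketch below, in Python, only illustrates the effect, using a deliberately simplified, hypothetical pattern (`FULL_FORM`, eight full hex groups) rather than the real regex.

```python
import re

# Hypothetical, deliberately simplified pattern: eight full hex groups only.
FULL_FORM = r"([0-9A-Fa-f]{1,4}:){7}[0-9A-Fa-f]{1,4}"


def unanchored(text):
    """Like a pattern with ^ and $ stripped: any matching substring counts."""
    return re.search(FULL_FORM, text) is not None


def anchored(text):
    """Roughly like a pattern wrapped in ^...$: the whole input must match."""
    return re.fullmatch(FULL_FORM, text) is not None


if __name__ == "__main__":
    for text in ["1:2:3:4:5:6:7:8", "1:2:3:4:5:6:7:8:9", "x1:2:3:4:5:6:7:8"]:
        print(text, unanchored(text), anchored(text))
```

The nine-group input and the one with a stray leading character are both accepted by the unanchored search, because each contains a valid eight-group run, and both are rejected once the whole input has to match.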

  7. Patrick says:

    Hi Jonas,

    I’ve tested the regex on RegexTester with the preg dialect.

    When doing that, only the first one you mention (fe80:0000:0000:0000:0204:61ff:254.157.241.086 // IPv4 dotted quad at the end) fails. I should check whether the update Stephen Ryan (see above) mentioned has already been made available.

  8. Rich Brown says:

    Folks,

    We have done some maintenance on that Regular Expression knowledgebase article at InterMapper, and as of early February 2010, it should handle all those cases properly. (http://forums.dartware.com/viewtopic.php?t=452)

    Note that there are additional articles that link to Perl, JavaScript, Ruby, and Java implementations.

    The JavaScript on the IPv6 Address Validator page (http://intermapper.com/ipv6validator) also converts the address to its “best representation”.

    Best regards,

    Rich Brown
    Dartware, LLC
