Golang regex replace excluding quoted strings

Answer #1 100 %

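The first two patterns below remove comments but do not preserve formatting: a // comment is removed together with its trailing newline.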


Preferred way (group 1 is left unmatched, i.e. NULL, whenever a comment is matched).
Works in the Go playground:

     # https://play.golang.org/p/yKtPk5QCQV
     # fmt.Println(reg.ReplaceAllString(txt, "$1"))
     # (?:/\*[^*]*\*+(?:[^/*][^*]*\*+)*/|//[^\n]*(?:\n|$))|("[^"\\]*(?:\\[\S\s][^"\\]*)*"|'[^'\\]*(?:\\[\S\s][^'\\]*)*'|[\S\s][^/"'\\]*)

     (?:                              # Comments 
          /\*                              # Start /* .. */ comment
          [^*]* \*+
          (?: [^/*] [^*]* \*+ )*
          /                                # End /* .. */ comment
       |  
          //  [^\n]*                       # Start // comment
          (?: \n | $ )                     # End // comment
     )
  |  
     (                                # (1 start), Non - comments 
          "
          [^"\\]*                          # Double quoted text
          (?: \\ [\S\s] [^"\\]* )*
          "
       |  
          '
          [^'\\]*                          # Single quoted text
          (?: \\ [\S\s] [^'\\]* )*
          ' 
       |  [\S\s]                           # Any other char
          [^/"'\\]*                        # Chars which don't start a comment, string, escape, or line continuation (escape + newline)
     )                                # (1 end)
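
A minimal sketch of how the single-line form of this pattern might be used in a Go program. The helper name `stripComments` and the sample input are assumptions made for illustration; only the regex and the `"$1"` replacement come from the answer.

```go
package main

import (
	"fmt"
	"regexp"
)

// commentRe is the single-line form of the pattern above, unchanged.
// Comments are matched outside any group; everything else lands in group 1.
var commentRe = regexp.MustCompile(`(?:/\*[^*]*\*+(?:[^/*][^*]*\*+)*/|//[^\n]*(?:\n|$))|("[^"\\]*(?:\\[\S\s][^"\\]*)*"|'[^'\\]*(?:\\[\S\s][^'\\]*)*'|[\S\s][^/"'\\]*)`)

// stripComments (hypothetical helper name) replaces every match with group 1,
// so comment matches collapse to "" and all other text is copied through.
func stripComments(src string) string {
	return commentRe.ReplaceAllString(src, "$1")
}

func main() {
	// Sample input (an assumption): one quoted "//" and two real comments.
	src := "s := \"// not a comment\" // a real comment\n" +
		"n := 1 /* block */ + 2\n"
	fmt.Print(stripComments(src))
}
```

The quoted `//` survives because the string alternative consumes it whole. The newline that ends the `//` comment is part of the comment match, so the two input lines come out joined, which is the formatting loss mentioned at the top.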

Alternative way (group 1 is always matched, but may be empty).
Works in the Go playground:

 # https://play.golang.org/p/7FDGZSmMtP
 # fmt.Println(reg.ReplaceAllString(txt, "$1"))
 # (?:/\*[^*]*\*+(?:[^/*][^*]*\*+)*/|//[^\n]*(?:\n|$))?((?:"[^"\\]*(?:\\[\S\s][^"\\]*)*"|'[^'\\]*(?:\\[\S\s][^'\\]*)*'|[\S\s][^/"'\\]*)?)     

 (?:                              # Comments 
      /\*                              # Start /* .. */ comment
      [^*]* \*+
      (?: [^/*] [^*]* \*+ )*
      /                                # End /* .. */ comment
   |  
      //  [^\n]*                       # Start // comment
      (?: \n | $ )                     # End // comment
 )?
 (                                # (1 start), Non - comments 
      (?:
           "
           [^"\\]*                          # Double quoted text
           (?: \\ [\S\s] [^"\\]* )*
           "
        |  
           '
           [^'\\]*                          # Single quoted text
           (?: \\ [\S\s] [^'\\]* )*
           ' 
        |  [\S\s]                           # Any other char
           [^/"'\\]*                        # Chars which don't start a comment, string, escape, or line continuation (escape + newline)
      )?
 )                                # (1 end)
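
The difference between the two variants can be checked in Go through the submatch indices: a group that did not take part in the match is reported with index -1. The sketch below is an illustration only; the names `preferred`, `alternative` and `group1Participated` are hypothetical, and the two patterns are copied from the single-line forms given above.

```go
package main

import (
	"fmt"
	"regexp"
)

// Single-line forms of the two patterns, copied unchanged.
var preferred = regexp.MustCompile(`(?:/\*[^*]*\*+(?:[^/*][^*]*\*+)*/|//[^\n]*(?:\n|$))|("[^"\\]*(?:\\[\S\s][^"\\]*)*"|'[^'\\]*(?:\\[\S\s][^'\\]*)*'|[\S\s][^/"'\\]*)`)
var alternative = regexp.MustCompile(`(?:/\*[^*]*\*+(?:[^/*][^*]*\*+)*/|//[^\n]*(?:\n|$))?((?:"[^"\\]*(?:\\[\S\s][^"\\]*)*"|'[^'\\]*(?:\\[\S\s][^'\\]*)*'|[\S\s][^/"'\\]*)?)`)

// group1Participated reports whether capture group 1 took part in the
// first match of re in s. loc[2] is the start index of group 1, or -1.
func group1Participated(re *regexp.Regexp, s string) bool {
	loc := re.FindStringSubmatchIndex(s)
	return loc != nil && loc[2] >= 0
}

func main() {
	comment := "/*c*/" // input consisting of a single block comment
	fmt.Println(group1Participated(preferred, comment))   // group 1 unmatched
	fmt.Println(group1Participated(alternative, comment)) // group 1 matched, but empty
}
```

With `ReplaceAllString(txt, "$1")` both variants give the same text, because Go expands an unmatched group to an empty string. The distinction matters mainly in engines or callbacks that treat an unmatched group differently from an empty one.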

The Cadillac - Preserves Formatting

(Unfortunately, this can't be done in Go, because Go's regexp package does not support lookahead assertions.)
Posted in case you move to a different regex engine.
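
As a quick check of that limitation, the sketch below asks Go's `regexp.Compile` whether it accepts a pattern. The helper name `compiles` and the two short sample patterns are assumptions for illustration; the first contains a `(?=...)` lookahead of the kind used in the pattern that follows.

```go
package main

import (
	"fmt"
	"regexp"
)

// compiles (hypothetical helper) reports whether Go's regexp package
// accepts the pattern.
func compiles(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

func main() {
	// Newline followed by a lookahead for an indented // comment.
	fmt.Println(compiles(`\r?\n(?=[ \t]*//)`))
	// The same text without the assertion.
	fmt.Println(compiles(`\r?\n[ \t]*//`))
}
```

The first call reports a compile failure and the second succeeds, since Go's regexp package follows RE2 syntax and rejects lookaround constructs at compile time.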

     # raw:   ((?:(?:^[ \t]*)?(?:/\*[^*]*\*+(?:[^/*][^*]*\*+)*/(?:[ \t]*\r?\n(?=[ \t]*(?:\r?\n|/\*|//)))?|//(?:[^\\]|\\(?:\r?\n)?)*?(?:\r?\n(?=[ \t]*(?:\r?\n|/\*|//))|(?=\r?\n))))+)|("[^"\\]*(?:\\[\S\s][^"\\]*)*"|'[^'\\]*(?:\\[\S\s][^'\\]*)*'|(?:\r?\n|[\S\s])[^/"'\\\s]*)
     # delimited:  /((?:(?:^[ \t]*)?(?:\/\*[^*]*\*+(?:[^\/*][^*]*\*+)*\/(?:[ \t]*\r?\n(?=[ \t]*(?:\r?\n|\/\*|\/\/)))?|\/\/(?:[^\\]|\\(?:\r?\n)?)*?(?:\r?\n(?=[ \t]*(?:\r?\n|\/\*|\/\/))|(?=\r?\n))))+)|("[^"\\]*(?:\\[\S\s][^"\\]*)*"|'[^'\\]*(?:\\[\S\s][^'\\]*)*'|(?:\r?\n|[\S\s])[^\/"'\\\s]*)/

     (                                # (1 start), Comments 
          (?:
               (?: ^ [ \t]* )?                  # <- To preserve formatting
               (?:
                    /\*                              # Start /* .. */ comment
                    [^*]* \*+
                    (?: [^/*] [^*]* \*+ )*
                    /                                # End /* .. */ comment
                    (?:                              # <- To preserve formatting 
                         [ \t]* \r? \n                                      
                         (?=
                              [ \t]*                  
                              (?: \r? \n | /\* | // )
                         )
                    )?
                 |  
                    //                               # Start // comment
                    (?:                              # Possible line-continuation
                         [^\\] 
                      |  \\ 
                         (?: \r? \n )?
                    )*?
                    (?:                              # End // comment
                         \r? \n                               
                         (?=                              # <- To preserve formatting
                              [ \t]*                          
                              (?: \r? \n | /\* | // )
                         )
                      |  (?= \r? \n )
                    )
               )
          )+                               # Grab multiple comment blocks if need be
     )                                # (1 end)

  |                                 ## OR

     (                                # (2 start), Non - comments 
          "
          [^"\\]*                          # Double quoted text
          (?: \\ [\S\s] [^"\\]* )*
          "
       |  
          '
          [^'\\]*                          # Single quoted text
          (?: \\ [\S\s] [^'\\]* )*
          ' 
       |  
          (?: \r? \n | [\S\s] )            # Linebreak or Any other char
          [^/"'\\\s]*                      # Chars which don't start a comment, string, escape,
                                           # or line continuation (escape + newline)
     )                                # (2 end)



Tags: regex, go
