Java Regex 2 - Duplicate Words

390 Discussions

  • + 0 comments

    This is my code. The output I get looks the same as the expected output, but I am still getting "Wrong Answer".

    import java.util.Scanner;
    import java.util.regex.Pattern;

    public class DuplicateWords {

        public static void main(String[] args) {
            // A word, followed by one or more whitespace-separated repeats of it
            final String regex = "\\b(\\w+)(\\s+\\1\\b)+";
            final Pattern p = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);

            final Scanner in = new Scanner(System.in);
            int numSentences = Integer.parseInt(in.nextLine());

            while (numSentences-- > 0) {
                String input = in.nextLine();
                // Replace each run of duplicates with its first occurrence
                input = p.matcher(input).replaceAll("$1");
                System.out.println(input);
            }

            in.close();
        }
    }

  • + 1 comment

    With this code I got the expected results, except for a mysterious character at the end of the last test case. It is neither whitespace nor an unprintable character. It makes no sense to me.

    import java.util.Scanner;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;
    
    public class DuplicateWords {
    
        public static void main(String[] args) {
    
            String regex = "\\b(\\w+)\\s+\\1\\b";
            Pattern p = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
    
            Scanner in = new Scanner(System.in);
            int numSentences = Integer.parseInt(in.nextLine());
            
            while (numSentences-- > 0) {
                String input = in.nextLine();
                Matcher m = p.matcher(input);
                
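                // Check for subsequences of input that match the compiled pattern.
                // Each pass only collapses pairs, so re-match the updated input
                // until no adjacent duplicate remains (e.g. "a a a" needs two passes).
                while (m.find()) {
                    input = input.replaceAll("(?i)" + regex, "$1");
                    m = p.matcher(input);
                }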
                
                // Prints the modified sentence.
                System.out.print(input+'\n');
            }
    			
            in.close();
        }
    }
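
    A note on the pairwise pattern `\b(\w+)\s+\1\b` used in this post: a single replaceAll pass collapses one pair per match, so a word repeated three or more times can leave a duplicate behind. The sketch below compares one pairwise pass, repeated pairwise passes, and a single pass with the run pattern `\b(\w+)(\s+\1\b)+` from other posts in this thread. The class and method names are made up for illustration, and the input is not one of the judge's test cases.

```java
import java.util.regex.Pattern;

class PairVersusRun {
    // Pairwise pattern: a word followed by exactly one repetition.
    static final Pattern PAIR =
        Pattern.compile("\\b(\\w+)\\s+\\1\\b", Pattern.CASE_INSENSITIVE);

    // Run pattern: a word followed by one or more repetitions.
    static final Pattern RUN =
        Pattern.compile("\\b(\\w+)(\\s+\\1\\b)+", Pattern.CASE_INSENSITIVE);

    // A single pass with the pairwise pattern.
    static String onePassPairs(String s) {
        return PAIR.matcher(s).replaceAll("$1");
    }

    // Repeat the pairwise pass until the text stops changing.
    static String repeatedPairs(String s) {
        String prev;
        do {
            prev = s;
            s = PAIR.matcher(s).replaceAll("$1");
        } while (!s.equals(prev));
        return s;
    }

    // A single pass with the run pattern.
    static String onePassRuns(String s) {
        return RUN.matcher(s).replaceAll("$1");
    }

    public static void main(String[] args) {
        String s = "to to to be";
        System.out.println(onePassPairs(s));  // to to be
        System.out.println(repeatedPairs(s)); // to be
        System.out.println(onePassRuns(s));   // to be
    }
}
```

    If the loop structure is free to change, the run pattern avoids the need for a second pass.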
    
  • + 0 comments

    String regex = "\\b(\\w+)(\\s+\\1\\b)+";
    Pattern p = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);

    Scanner in = new Scanner(System.in);
    int numSentences = Integer.parseInt(in.nextLine());

    while (numSentences-- > 0) {
        String input = in.nextLine();

        Matcher m = p.matcher(input);

        // Check for subsequences of input that match the compiled pattern
        while (m.find()) {
            input = input.replaceAll(m.group(), m.group(1));
        }

        // Prints the modified sentence.
        System.out.println(input);
    }
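
    One caveat with passing m.group() to String.replaceAll: the matched text is then used as a new regex without the word boundaries, so it can also match inside longer words. The sketch below uses a contrived input (not one of the judge's test cases) to show the difference from Matcher.replaceAll; the class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupAsRegexCaveat {
    static final Pattern P =
        Pattern.compile("\\b(\\w+)(\\s+\\1\\b)+", Pattern.CASE_INSENSITIVE);

    // Mirrors the loop in the post: the matched text becomes the search regex.
    static String viaStringReplaceAll(String input) {
        Matcher m = P.matcher(input);
        while (m.find()) {
            input = input.replaceAll(m.group(), m.group(1));
        }
        return input;
    }

    // Lets the compiled pattern do the replacement, so \b still applies.
    static String viaMatcherReplaceAll(String input) {
        return P.matcher(input).replaceAll("$1");
    }

    public static void main(String[] args) {
        String s = "ab ab xab abx";
        System.out.println(viaStringReplaceAll(s));  // ab xabx
        System.out.println(viaMatcherReplaceAll(s)); // ab xab abx
    }
}
```

    In the first variant the literal text "ab ab" is also found inside "xab abx", which is not a repeated word. Inputs like this may never appear in the tests, so treat it as an edge case to be aware of rather than a reason the loop fails.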
    
  • + 1 comment

    My solution:

    1. \b - a word boundary
    2. \w+ - one or more word characters (letters, digits, or underscores)
    3. \s+ - one or more whitespace characters
    4. \1 - a backreference to whatever (\w+) captured

    public static void main(String[] args) {
    
            String regex = "\\b(\\w+)(\\s+\\1)+\\b";
            Pattern p = Pattern.compile(regex, Pattern.CASE_INSENSITIVE /* Insert the correct Pattern flag here.*/);
    
            Scanner in = new Scanner(System.in);
            int numSentences = Integer.parseInt(in.nextLine());
            
            while (numSentences-- > 0) {
                String input = in.nextLine();
                
                Matcher m = p.matcher(input);
                
                // Check for subsequences of input that match the compiled pattern
                while (m.find()) {
                    input = input.replaceAll(m.group()/* The regex to replace */, m.group(1)/* The replacement. */);
                }
                
                // Prints the modified sentence.
                System.out.println(input);
            }
            
            in.close();
        }
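
    To see why the \b anchors in the breakdown above matter, here is a small sketch comparing a pattern without word boundaries against one with them. The helper name collapse and the sample sentences are assumptions chosen for illustration.

```java
import java.util.regex.Pattern;

class BoundaryDemo {
    // Collapse every match of the given regex to its first captured word.
    static String collapse(String regex, String s) {
        return Pattern.compile(regex, Pattern.CASE_INSENSITIVE)
                      .matcher(s)
                      .replaceAll("$1");
    }

    public static void main(String[] args) {
        String noBoundaries = "(\\w+)(\\s+\\1)+";
        String withBoundaries = "\\b(\\w+)(\\s+\\1\\b)+";

        // Without a leading \b, the "is" at the end of "this" pairs with "is".
        System.out.println(collapse(noBoundaries, "this is fine"));   // this fine
        System.out.println(collapse(withBoundaries, "this is fine")); // this is fine

        // Without a trailing \b, "the" pairs with the start of "then".
        System.out.println(collapse(noBoundaries, "the then"));       // then
        System.out.println(collapse(withBoundaries, "the then"));     // the then
    }
}
```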
    
  • + 0 comments

    Is this what we really want? I think that this paragraph should be improved:

    3. Write the two necessary arguments for replaceAll such **that each repeated word is replaced with the very first instance the word found in the sentence**. It must be the exact first occurrence of the word, as the expected output is case-sensitive.
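
    One way to read that requirement, as a hedged sketch: with Pattern.CASE_INSENSITIVE the backreference \1 matches repeats in any casing, while the replacement $1 inserts group 1 exactly as it was first written. The class name, the flags parameter, and the sample sentence below are illustrative assumptions.

```java
import java.util.regex.Pattern;

class FirstOccurrenceCase {
    // Collapse runs of a repeated word, keeping the first word as written.
    // flags is passed straight to Pattern.compile (0 means no flags).
    static String collapse(String s, int flags) {
        return Pattern.compile("\\b(\\w+)(\\s+\\1\\b)+", flags)
                      .matcher(s)
                      .replaceAll("$1");
    }

    public static void main(String[] args) {
        String s = "Hello hello Ab aB";

        // Case-insensitive: repeats match in any casing, first spelling is kept.
        System.out.println(collapse(s, Pattern.CASE_INSENSITIVE)); // Hello Ab

        // Case-sensitive: "Hello" and "hello" are not treated as repeats.
        System.out.println(collapse(s, 0)); // Hello hello Ab aB
    }
}
```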