Question 1

What does removing duplicates do?

Accepted Answer

It scans your input, identifies items that appear more than once, and emits a clean version with each unique item kept only one time. This page offers two modes. Line mode treats every line as one item (used for email lists, URL lists, keyword lists, spreadsheet columns). Word mode tokenizes the whole text into words and keeps one of each (used for vocabulary lists, tag-cloud inputs, content audits). Live counts show how many duplicates were removed and how many unique items remain.
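
As a rough illustration of line mode and its live counts, here is a minimal Python sketch. It is not the page's actual script (which runs in the browser); the function name `dedupe_lines` and the shape of its return value are assumptions made for this example.

```python
def dedupe_lines(text):
    """Return (unique_lines, duplicates_removed), keeping first occurrences."""
    lines = text.splitlines()
    seen = set()
    unique = []
    for line in lines:
        if line not in seen:  # exact match: case and whitespace both count
            seen.add(line)
            unique.append(line)
    return unique, len(lines) - len(unique)


unique, removed = dedupe_lines("apple\nbanana\napple\ncherry\nbanana")
print(unique)   # ['apple', 'banana', 'cherry']
print(removed)  # 2
```

The two returned values correspond to the counts described above: the length of the list is the number of unique items remaining, and the integer is the number of duplicates removed.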

Question 2

What is the difference between line dedup and word dedup?

Accepted Answer

Line dedup compares whole lines. Two lines are duplicates only if their entire content matches (subject to the case-insensitive and trim toggles). Word dedup ignores line breaks entirely, extracts every word from the input (runs of letters and digits, with internal hyphens and apostrophes preserved), and keeps one of each. Line dedup is for one-item-per-line lists; word dedup is for prose where you want the unique vocabulary.
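
The word-mode behaviour described above can be approximated with a regular expression. This is a sketch only: the pattern and the name `unique_words` are assumptions, and the tool's real tokenizer may treat edge cases (typographic apostrophes, underscores, non-Latin scripts) differently.

```python
import re

# Assumed pattern: a run of letters/digits, optionally followed by groups of
# one hyphen or apostrophe plus another run of letters/digits. Hyphens and
# apostrophes at the edges of a word are therefore not part of the token.
WORD_RE = re.compile(r"[^\W_]+(?:[-'][^\W_]+)*")


def unique_words(text):
    """Tokenize across line breaks and keep the first copy of each word."""
    return list(dict.fromkeys(WORD_RE.findall(text)))


print(unique_words("The well-known cat.\nDon't wake the cat -- the well-known one!"))
# ['The', 'well-known', 'cat', "Don't", 'wake', 'the', 'one']
```

Note that "The" and "the" both survive here because the comparison is case-sensitive by default.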

Question 3

Should I turn on case-insensitive matching?

Accepted Answer

Yes for most real-world lists. Email addresses are case-insensitive in practice (the domain part always is, and mainstream providers treat the mailbox name the same way, so Alice@Site.com and alice@site.com reach the same mailbox); SEO keyword exports often capitalize the first letter of one variant; and in word mode, a sentence-initial "The" should usually match a mid-sentence "the". Leave it off only when casing is significant, such as case-sensitive identifiers, API keys, or filenames on case-sensitive filesystems.
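
One common way to implement case-insensitive matching is to compare a normalized key while emitting the item as it was typed. The sketch below assumes that approach; whether the tool keeps the original casing or lowercases its output is not stated here, and `dedupe_case_insensitive` is a hypothetical name.

```python
def dedupe_case_insensitive(items):
    """Keep the first copy of each item, comparing with case folded away."""
    seen = set()
    kept = []
    for item in items:
        key = item.casefold()  # comparison key only; the item itself is untouched
        if key not in seen:
            seen.add(key)
            kept.append(item)
    return kept


print(dedupe_case_insensitive(["Alice@Site.com", "alice@site.com", "Bob@site.com"]))
# ['Alice@Site.com', 'Bob@site.com']
```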

Question 4

What does the trim option do, and when do I need it?

Accepted Answer

Trim ignores leading and trailing whitespace when comparing items. It only applies in line mode (word mode already trims because the tokenizer skips whitespace). Turn it on when your input was pasted from a spreadsheet or scraped from a page that left stray spaces or tabs at the end of some lines. Without it, "foo" and "foo " (with a trailing space) are treated as different items.
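
A minimal sketch of the trim toggle in line mode, assuming whitespace is stripped only for the comparison and the surviving line is emitted unchanged (the tool may instead emit the trimmed text). The function name and the `trim` parameter are illustrative.

```python
def dedupe_lines(lines, trim=False):
    """Keep the first copy of each line; optionally ignore edge whitespace."""
    seen = set()
    kept = []
    for line in lines:
        key = line.strip() if trim else line  # strip() removes spaces and tabs
        if key not in seen:
            seen.add(key)
            kept.append(line)
    return kept


rows = ["foo", "foo ", "\tfoo", "bar"]
print(dedupe_lines(rows))             # ['foo', 'foo ', '\tfoo', 'bar']
print(dedupe_lines(rows, trim=True))  # ['foo', 'bar']
```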

Question 5

What is the difference between keep first and keep last?

Accepted Answer

If your input has the same item three times, only one copy survives. Keep first preserves the earliest occurrence and drops the later ones; keep last preserves the most recent and drops the earlier ones. Surviving items keep their relative input order: with keep first an item stays where it first appeared, with keep last where it last appeared. Keep last is useful when later entries supersede earlier ones (e.g. a log where the last status for an ID is the current one).
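
To make the two policies concrete, here is a small sketch. The `keep` and `key` parameters are assumptions for illustration, not the tool's actual options; keep last is implemented by scanning the input backwards and then restoring input order.

```python
def dedupe(items, keep="first", key=lambda s: s):
    """Keep one copy per key; survivors stay in their input order."""
    if keep not in ("first", "last"):
        raise ValueError("keep must be 'first' or 'last'")
    ordered = items if keep == "first" else reversed(items)
    seen = set()
    kept = []
    for item in ordered:
        k = key(item)
        if k not in seen:
            seen.add(k)
            kept.append(item)
    if keep == "last":
        kept.reverse()  # undo the backwards scan
    return kept


data = ["a", "b", "a", "c", "b"]
print(dedupe(data, keep="first"))  # ['a', 'b', 'c']
print(dedupe(data, keep="last"))   # ['a', 'c', 'b']
```

With exact matching the two policies differ only in where each item lands. Once a looser comparison is in play (for example `key=str.casefold`), they can also differ in which spelling is kept, under this sketch's assumption that the original text is emitted.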

Question 6

Is my text uploaded anywhere?

Accepted Answer

No. The duplicate remover runs entirely in your browser; nothing you paste is uploaded, logged, or sent to a server. Your text is held in your browser's per-tab session storage so a refresh doesn't lose your work, and it is cleared when you close the tab. The page itself is static HTML; the deduplication is a small client-side script.

Cleaning task	Mode + options	Why these settings
Email list	Lines, case-insensitive, trim	Collapses case variants and stray-space copies of the same address into one row. Case + trim are essential because email addresses are case-insensitive in practice and exports often include trailing spaces.
URL list	Lines, trim, sort	Deduplicate scraped or copy-pasted URLs. Trim catches lines that imported with whitespace; sorting alphabetizes them for diffing against a reference list.
Keyword list	Lines, case-insensitive	Strip repeats from an SEO keyword export. Leave case-insensitive on so capitalized variants and exact-match originals merge.
CSV column	Lines, default	Paste one column straight into the box. Default mode keeps case + whitespace exact, so the cleaned column matches what your spreadsheet expects.
Vocabulary list	Words, case-insensitive, sort	Extract unique words from a paragraph or chapter. Sorting helps you scan for the words you actually want to study; case-insensitive avoids treating sentence-initial words as new entries.
Tag cloud input	Words, case-insensitive	Reduce a long body of text to the unique tag candidates before any further analysis. Pair with the keyword-density tool when it ships for frequency-weighting.
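
The cheat sheet can be read as a set of option presets. The sketch below encodes each row that way and applies it with one function; the `Options` fields, the `PRESETS` keys, the word pattern, and the case-insensitive sort order are all assumptions for illustration rather than the tool's real configuration.

```python
import re
from dataclasses import dataclass

# Assumed word pattern: letters/digits with internal hyphens or apostrophes.
WORD_RE = re.compile(r"[^\W_]+(?:[-'][^\W_]+)*")


@dataclass(frozen=True)
class Options:
    mode: str = "lines"            # "lines" or "words"
    case_insensitive: bool = False
    trim: bool = False             # only meaningful in line mode
    sort: bool = False


PRESETS = {
    "email list": Options("lines", case_insensitive=True, trim=True),
    "url list": Options("lines", trim=True, sort=True),
    "keyword list": Options("lines", case_insensitive=True),
    "csv column": Options("lines"),
    "vocabulary list": Options("words", case_insensitive=True, sort=True),
    "tag cloud input": Options("words", case_insensitive=True),
}


def remove_duplicates(text, opts):
    """Keep the first copy of each item under the given options."""
    items = WORD_RE.findall(text) if opts.mode == "words" else text.splitlines()
    seen = set()
    kept = []
    for item in items:
        key = item.strip() if opts.trim else item
        if opts.case_insensitive:
            key = key.casefold()
        if key not in seen:
            seen.add(key)
            kept.append(item)
    # Assumed sort order: alphabetical, ignoring case.
    return sorted(kept, key=str.casefold) if opts.sort else kept


emails = "Alice@Site.com\nalice@site.com \nbob@site.com"
print(remove_duplicates(emails, PRESETS["email list"]))
# ['Alice@Site.com', 'bob@site.com']
print(remove_duplicates("The cat saw the dog", PRESETS["vocabulary list"]))
# ['cat', 'dog', 'saw', 'The']
```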

Remove Duplicates

Who built this

The WordCounters team
