Research:Revision scoring as a service/Word lists/sco


ISO code Language Generated list Badwords Informal words Stopwords Dictionary Stemmer Contact person Wiki labels Interface Forms Campaign Needs
sco Scots (Wikipedia) 250 70 - 355 custom stop words enchant.Dict - See: Word lists no no no no informal words, stemmer
Generated list [1]

Words in the generated list commonly appear in reverted revisions but not in others. This list is generated using a TF-IDF approach.

  1. accent
  2. adverb
  3. adverbs
  4. aftur
  5. agoo
  6. airish
  7. airleann
  8. airselins
  9. akchtualee
  10. albainis
  11. aleest
  12. alrite
  13. anee
  14. angus
  15. anoyed
  16. apologetic
  17. ass
  18. awgates
  19. bagpipe
  20. bairnag
  21. balang
  22. ballixed
  23. barn
  24. basically
  25. battar
  26. batter
  27. battul
  28. batul
  29. batween
  30. beed
  31. bees
  32. bittock
  33. bnp
  34. boot
  35. bootiful
  36. brawlies
  37. brits
  38. brok
  39. bruilyie
  40. brulzie
  41. cald
  42. calledg
  43. cambat
  44. campletly
  45. catlick
  46. cdb
  47. cheordag
  48. cially
  49. clauses
  50. cleant
  51. clever
  52. colledge
  53. confoozun
  54. continyuus
  55. contractable
  56. conversation
  57. costy
  58. cours
  59. cunt
  60. daftees
  61. dafties
  62. daftys
  63. dat
  64. dats
  65. decided
  66. decidud
  67. dees
  68. delete
  69. demselves
  70. ders
  71. dey
  72. dialekts
  73. didn
  74. differans
  75. diminutives
  76. doos
  77. dose
  78. dunch
  79. edinburghee
  80. eebil
  81. emo
  82. endweys
  83. englang
  84. esparanto
  85. fane
  86. fauchelt
  87. fawks
  88. feardie
  89. feartie
  90. fenian
  91. fixed
  92. frum
  93. fterran
  94. fuck
  95. fucking
  96. fuish
  97. fuishen
  98. gai
  99. gamie
  100. gay
  101. genitalsinspace
  102. geylies
  103. ghallda
  104. glaswegian
  105. glottal
  106. gooverment
  107. goovernment
  108. grutten
  109. gvib
  110. gwib
  111. gwibnbf
  112. haingles
  113. happund
  114. harmless
  115. hauflins
  116. hed
  117. hippocrene
  118. homework
  119. hooman
  120. hoomans
  121. hooseockie
  122. hunderwecht
  123. hurtit
  124. ilkagate
  125. ilkawey
  126. incredibul
  127. invenchoon
  128. invented
  129. irraigular
  130. joost
  131. kilt
  132. kiltie
  133. kingsmore
  134. know
  135. laifs
  136. lalland
  137. lallands
  138. lauchen
  139. leuch
  140. look
  141. ltscotland
  142. luk
  143. macdonald
  144. mad
  145. matters
  146. meen
  147. meens
  148. merge
  149. mooch
  150. moonee
  151. naadays
  152. naintee
  153. naintoon
  154. negation
  155. nobill
  156. nocht
  157. nos
  158. nothin
  159. nown
  160. ockie
  161. olan
  162. oolster
  163. oop
  164. oorapeen
  165. ophilique
  166. orra
  167. overflow
  168. overly
  169. paice
  170. pairisian
  171. pensfuness
  172. playses
  173. poop
  174. postie
  175. proddy
  176. pronunci
  177. pyntless
  178. retaard
  179. retard
  180. reteeners
  181. rhodie
  182. rites
  183. sailed
  184. sampul
  185. sayd
  186. scoot
  187. scoots
  188. scotchland
  189. scotsera
  190. scotsessay
  191. scotstext
  192. scunner
  193. scyttisce
  194. sed
  195. sentans
  196. sereeously
  197. seventoon
  198. sgoteg
  199. shin
  200. shit
  201. shud
  202. sians
  203. signs
  204. skeem
  205. skelpit
  206. skoska
  207. skotskt
  208. sooch
  209. sorry
  210. spawkin
  211. speaks
  212. specialfocus
  213. spellins
  214. spook
  215. steek
  216. stob
  217. stooryduster
  218. stopthebnp
  219. stupid
  220. suck
  221. sudern
  222. sumtin
  223. taek
  224. tageder
  225. tat
  226. tayk
  227. tha
  228. thank
  229. thingies
  230. thonder
  231. tink
  232. tuns
  233. tweentee
  234. udder
  235. uncyclopedia
  236. unstressed
  237. valid
  238. verbless
  239. wanchancie
  240. wanted
  241. warrack
  242. wat
  243. watergaw
  244. wich
  245. widdershins
  246. wifeockie
  247. wifes
  248. wirainleid
  249. wise
  250. words
  251. wordswithoutborders
  252. wrangously
  253. yin
  254. yoosed
  255. you
  256. yound
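
The page does not give the exact scoring formula, so the following is only a minimal sketch of one TF-IDF-style way to produce such a list. It assumes that term frequency is counted over the text of reverted revisions and that inverse document frequency is computed over the other revisions. The function names, the tokenizer and the smoothing are illustrative assumptions, not the method actually used to generate the list above.

```python
import math
import re
from collections import Counter

TOKEN_RE = re.compile(r"[a-z]+")


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens."""
    return TOKEN_RE.findall(text.lower())


def reverted_tfidf(reverted_texts, other_texts, top_n=10):
    """Rank words that are frequent in reverted text but rare elsewhere.

    Assumed scoring (hypothetical): term frequency over all reverted
    text, multiplied by a smoothed inverse document frequency computed
    over the non-reverted revisions.
    """
    tf = Counter(tok for text in reverted_texts for tok in tokenize(text))
    df = Counter(tok for text in other_texts for tok in set(tokenize(text)))
    n_other = len(other_texts)
    scores = {
        word: count * math.log((1 + n_other) / (1 + df[word]))
        for word, count in tf.items()
    }
    # Highest score first; ties are broken alphabetically.
    return sorted(scores, key=lambda w: (-scores[w], w))[:top_n]


# Toy data, invented for illustration only.
reverted = ["ye are stupid stupid", "the toun is stupid"]
other = [
    "the toun is in the north",
    "the kintra is lairge",
    "ye can see the toun",
]
print(reverted_tfidf(reverted, other, top_n=2))  # ['stupid', 'are']
```

In this toy run, 'the' scores zero because it occurs in every non-reverted revision, which is the behaviour the description above relies on.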
Generated common words

Common words appear in all revisions, reverted or otherwise. In English, this would include words like 'the' or 'is', which carry little meaning on their own. This list is generated using a TF-IDF approach.

  1. aboot
  2. accessdate
  3. airtins
  4. als
  5. and
  6. ane
  7. ang
  8. archive
  9. are
  10. area
  11. arz
  12. ast
  13. atween
  14. aurie
  15. baith
  16. bar
  17. bat
  18. bcl
  19. been
  20. births
  21. blank
  22. bpy
  23. but
  24. caipital
  25. caption
  26. category
  27. ceb
  28. ceety
  29. center
  30. central
  31. centre
  32. century
  33. cite
  34. city
  35. ckb
  36. code
  37. com
  38. commonscat
  39. cried
  40. csb
  41. daiths
  42. date
  43. day
  44. defaultsort
  45. density
  46. destrict
  47. diq
  48. display
  49. dst
  50. durin
  51. east
  52. efter
  53. elevation
  54. end
  55. established
  56. europe
  57. european
  58. ext
  59. file
  60. first
  61. fiu
  62. flag
  63. foondit
  64. footnotes
  65. for
  66. fowk
  67. frae
  68. freemit
  69. from
  70. frp
  71. fur
  72. gan
  73. gov
  74. government
  75. govrenment
  76. hae
  77. haed
  78. haes
  79. hif
  80. history
  81. hsb
  82. htm
  83. html
  84. http
  85. ilo
  86. image
  87. imagesize
  88. info
  89. infobox
  90. inglis
  91. intae
  92. internaitional
  93. ither
  94. its
  95. jpg
  96. kent
  97. kinrick
  98. kintra
  99. kintras
  100. ksh
  101. label
  102. lad
  103. lairgest
  104. land
  105. lang
  106. language
  107. last
  108. latd
  109. latm
  110. latns
  111. lats
  112. leader
  113. leet
  114. left
  115. leid
  116. lij
  117. link
  118. list
  119. livin
  120. lmo
  121. location
  122. locatit
  123. longd
  124. longew
  125. longm
  126. longs
  127. magnitude
  128. main
  129. mair
  130. maist
  131. map
  132. mapsize
  133. may
  134. mayor
  135. members
  136. metro
  137. mey
  138. mhr
  139. min
  140. mony
  141. motto
  142. mzn
  143. nah
  144. naitional
  145. name
  146. nan
  147. nap
  148. national
  149. native
  150. nbsp
  151. nds
  152. new
  153. nickname
  154. north
  155. not
  156. note
  157. nou
  158. nov
  159. nrm
  160. offeecial
  161. official
  162. old
  163. oot
  164. org
  165. other
  166. ower
  167. pairt
  168. pam
  169. party
  170. place
  171. pms
  172. pnb
  173. png
  174. population
  175. position
  176. postal
  177. publisher
  178. pushpin
  179. ref
  180. references
  181. reflist
  182. region
  183. right
  184. rue
  185. sah
  186. scn
  187. scots
  188. seal
  189. see
  190. seicont
  191. settlement
  192. shield
  193. simple
  194. size
  195. skyline
  196. small
  197. smg
  198. some
  199. sooth
  200. state
  201. states
  202. stq
  203. stub
  204. style
  205. svg
  206. syne
  207. tae
  208. thare
  209. that
  210. the
  211. their
  212. this
  213. thumb
  214. time
  215. timezone
  216. title
  217. top
  218. total
  219. toun
  220. twa
  221. type
  222. uised
  223. umwhile
  224. unit
  225. unitit
  226. unner
  227. url
  228. utc
  229. vec
  230. vep
  231. vls
  232. vro
  233. war
  234. warld
  235. wast
  236. web
  237. website
  238. whaur
  239. which
  240. wis
  241. with
  242. wur
  243. wuu
  244. www
  245. xal
  246. xmf
  247. year
  248. years
  249. yue
  250. zea
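
As a rough illustration of how a common-word list could be derived, the sketch below ranks words by inverse document frequency across all revisions, lowest first. This is an assumption about what "a TF-IDF approach" means here. The function name and the sample texts are hypothetical.

```python
import math
import re
from collections import Counter


def common_words(texts, top_n=10):
    """Return the words with the lowest IDF, i.e. found in most texts."""
    df = Counter()
    for text in texts:
        # Count each word once per revision (document frequency).
        df.update(set(re.findall(r"[a-z]+", text.lower())))
    n = len(texts)
    idf = {word: math.log(n / count) for word, count in df.items()}
    # Lowest IDF first; ties are broken alphabetically.
    return sorted(idf, key=lambda w: (idf[w], w))[:top_n]


# Toy revisions, reverted and non-reverted mixed together.
revisions = [
    "the toun is in the north",
    "the kintra is lairge",
    "ye are stupid",
    "the toun haes a mayor",
]
print(common_words(revisions, top_n=3))  # ['the', 'is', 'toun']
```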

Bad words

Bad words are unwelcome on any page. They include curse words, spam, and other content that would be reverted regardless of where it is inserted.

Needs bad words... Use |list-badwords=

Informal words

Informal words are unwelcome in the article namespace but acceptable on talk pages. They include words such as 'hello' or 'hahaha', which would be fine in discussions but not in articles.

Needs informal words... Use |list-informal=
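
The distinction between the two lists can be sketched as a simple matcher: bad words are flagged everywhere, informal words only in articles. This is a hypothetical illustration, not the service's actual implementation. The bad-word entries are placeholders taken from the generated list above (no curated Scots list exists yet), and the informal entries are the English examples from the text.

```python
import re


def build_matcher(words):
    """Compile a case-insensitive, whole-word regex for a word list."""
    pattern = r"\b(?:" + "|".join(re.escape(w) for w in words) + r")\b"
    return re.compile(pattern, re.IGNORECASE)


# Placeholder lists (assumptions), not curated Scots word lists.
BAD_WORDS = build_matcher(["stupid", "poop"])
INFORMAL_WORDS = build_matcher(["hello", "hahaha"])


def flag_words(text, is_article):
    """Return flagged words: bad words always, informal ones in articles."""
    flagged = [m.group(0).lower() for m in BAD_WORDS.finditer(text)]
    if is_article:
        flagged += [m.group(0).lower() for m in INFORMAL_WORDS.finditer(text)]
    return flagged


print(flag_words("Hello, this is stupid", is_article=True))   # ['stupid', 'hello']
print(flag_words("Hello, this is stupid", is_article=False))  # ['stupid']
```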