IP Editing: Privacy Enhancement and Abuse Mitigation

IP masking hides the IP addresses of unregistered editors on Wikimedia projects, fully or partially, from everyone except those who need access to fight spam, vandalism, harassment and disinformation.

Currently, anyone can edit Wikimedia wikis without a Wikimedia account or without logging in. MediaWiki, the software behind Wikimedia projects, will record and publish your IP address in its public log. Anyone seeking your IP address will find it.

Wikimedia projects have a good reason for storing and publishing IP addresses: they play a critical role in keeping vandalism and harassment off our wikis.

However, your IP address can tell where you are editing from and can be used to identify you or your device. This is of particular concern if you are editing from a territory where our wikis are deemed controversial. Publishing your IP address may allow others to locate you.

With changes to privacy laws and standards (e.g., the General Data Protection Regulation and the global conversation about privacy that it started), the Wikimedia Foundation Legal team has decided to protect user privacy by hiding IPs from the general public. However, we will continue to give access to users who need to see the addresses to protect the wikis.

We're aware that this change will impact current anti-abuse workflows. We are committed to developing tools or maintaining access to tools that can identify and block vandals, sock puppets, editors with conflicts of interest and other bad actors after IPs are masked.

Product and Technical Updates

Implementation Strategy and next steps (25 February 2022)

Hello all. We have an update on the IP Masking implementation strategy.

First off, thank you to everyone who arrived on this page and offered their feedback. We heard from a lot of you about how this page is not easy to read and we are working on fixing that. We genuinely want to thank you for taking the time to go through the information here and on the talk page. We took every comment on the talk page into consideration before the decision about the implementation plan was made.

We want to preface this also by saying that there are still a lot of open questions. There is a long road ahead of us on this project and we would like you to voice your opinion in more of these discussions as they come up. If you haven’t already, please go through this post about who will continue to have IP address access before reading further.

We received mixed feedback from the community about the two proposed implementation ideas without a clear consensus either way. Here are some quotes taken from the talk page:

  • For small wikis, I think the IP based approach is better because it is unlikely that two anonymous users will have the same IP, and for a vandal modifying its Ip is most difficult that erasing cookies.
  • The session-based system does seem better, and would make it easier to communicate with anonymous editors. I'm an admin on English Wikipedia, and my main interaction with IP editors is reverting and warning them against vandalism. In several cases recently I haven't even bothered posting a warning, since it seems unlikely the right person would receive it. In one case I was trying to have a conversation about some proposed change, and I was talking to several different IP addresses, and it was unclear that it was actually the same person, and I had to keep asking them about that.
  • As an admin in German-language Wikipedia, of the two paths described here (IP based identity vs. session-based identity) I clearly prefer the IP based approach. It's just too easy to use a browser's privacy mode or to clear the cookies (I'm doing it myself all the time); changing your IP address at least requires a bit more effort, and we have already a policy against using open proxies in place. I agree with Beland that the session-based identity approach could probably make communication with well-meaning unregistered editors easier, but it just doesn't seem robust enough.
  • I prefer the session-based approach. It provides more value in being able to identify and communicate with legitimate anonymous editors. However, at the same time, we need abuse filter options to be able to identify multiple new sessions from a single IP. These could be legitimate (from a school, for example), but will most likely represent abuse or bot activity. One feature I haven't seen mentioned yet. When a session user wants to create an account, it should default to renaming the existing session ID to the new name of their choice. We need to be able to see and/or associate the new named user with their previous session activity.
  • I am leaning towards the IP-based identities, even if encrypted, as cookies seem more complicated to deal with and very bothersome to keep shutting their annoying pop-ups (very standard in Europe). I have to mention that I prefer that till this day, one could use Wikipedia without cookies, unless he wants to log in to edit with his username.
  • The ability to perform purely session-based blocks in addition to the existing IP+session blocking would be an interesting upgrade. Being able to communicate with IPv6 users through their session instead of their repeatedly changing IP address would also be a benefit.

In summary, the main argument against the session-based approach was that cookies are easy to get rid of and the user may change their identity very easily.

And the main arguments against the IP-based approach were:

  • the encryption method can be compromised, hence compromising the IP addresses themselves
  • this approach does not provide the benefit of improved communication with unregistered editors
  • it does not allow for session-based blocking (in addition to IP-based blocking)

In light of the above and the discussions with our technical team about the feasibility and wide-ranging implications of this implementation, we have decided to go with the session-based approach with some important additions to address the problem of users deleting their cookies and changing their identity. If a user repeatedly changes their username, it will be possible to link their identities by looking at additional information in the interface. We are still working out the details of how this will work - but it will be similar to how sockpuppet detection works (with some automation).
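
As a rough illustration of the kind of linking described above, the sketch below groups auto-generated session names by the IP address they edited from and flags addresses that produced several identities. It is not the planned implementation: the function names, the `(session_name, ip)` input format and the threshold of three are all assumptions made for this example.

```python
from collections import defaultdict


def sessions_per_ip(edits):
    """Map each IP address to the set of session names seen editing from it.

    `edits` is an iterable of (session_name, ip) pairs (hypothetical format).
    """
    by_ip = defaultdict(set)
    for session_name, ip in edits:
        by_ip[ip].add(session_name)
    return by_ip


def flag_ips(edits, threshold=3):
    """Return IPs that produced at least `threshold` distinct session names.

    The threshold is an arbitrary illustrative value. A flagged IP may be a
    school or library rather than one person clearing cookies.
    """
    return {
        ip: sorted(names)
        for ip, names in sessions_per_ip(edits).items()
        if len(names) >= threshold
    }


edits = [
    ("Anon1", "203.0.113.7"),
    ("Anon2", "203.0.113.7"),
    ("Anon3", "203.0.113.7"),
    ("Anon4", "198.51.100.2"),
]
flagged = flag_ips(edits)
```

A flag like this only surfaces candidates for review; a human would still need to compare behaviour before treating the identities as one person.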

We are working out a lot of the technical details still and will have another update for you shortly with more specifics. This includes LTA documentation, communication about IPs, AbuseFilters, third-party wikis, gadgets, user-scripts, WMF cloud tools, restrictions for IP-viewer rights etc. We appreciate your input and welcome any feedback you may have for us on the talk page.

IP Masking and changes to workflows (9 December 2021)

We have discussed the two different approaches to IP masking that we are considering. Following that, we have outlined a few workflows and how they will change under each of the two implementations.

Note that under both alternatives, admins, stewards, checkusers and users with the IPViewer right will be able to reveal IPs on pages such as Recent Changes and page History for anti-vandalism purposes.

Edit experience for unregistered editors

Current behavior: Currently, unregistered users can edit without logging in (on most wikis). Before they make an edit, they see a banner telling them that their IP address will be publicly recorded and published in perpetuity.

IP-based identity: Unregistered editors will be able to edit as they do now. Before they make an edit, they will see a message telling them that their edits will be attributed to an encrypted version of their IP address. The original IP address itself will be visible only to administrators and patrollers. It will be retained only for a limited period of time.

Session-based identity: This is similar to the above, except that editors will be told that their edits will be attributed to an auto-generated username.

Communication with unregistered editors

Current behavior: Unregistered editors are referred to by their IP address or, if they are a known long-term abuser, they may be given a name based on their behavior.

IP-based identity: Patrollers and admins will not be able to refer to IP addresses publicly, but will be able to refer to the editor's encrypted IP address or long-term abuser name. They can share the IP address with others who have access to it.

Session-based identity: Patrollers and admins will not be able to refer to IP addresses publicly, but will be able to refer to the editor's auto-generated username. They can share the IP address with others who have access to it. This may help identify a user, but could also be confusing if there are multiple IPs behind one username, similar to how many people can share a single IP today. To alleviate this concern, we are building a tool that can surface more information about all the different IP addresses an editor is active from.

Talk page experience for unregistered editors

Current behavior: An unregistered editor can receive messages on the talk page of their IP. Once the editor's IP address changes, they will receive messages on the talk page of the new IP address. This can fragment conversations and make it difficult to communicate with unregistered editors.

IP-based identity: In this implementation, the behavior remains the same as it is today. Unregistered editors will receive messages on the talk pages of their encrypted IPs, and when their IP changes, their associated talk page changes as well.

Session-based identity: In this implementation, unregistered editors will receive messages on a talk page that is associated with a cookie in their browser. Even if their IP address changes, they will still be able to receive messages on their talk page. If their browser cookie is cleared, they will lose their session identity and will get a new cookie and a new talk page associated with it. Since IPs change more often than cookies do, it is likely that many users would end up with a semi-permanent talk page unless they specifically try not to. Another noteworthy advantage is that talk page messages will no longer reach the wrong recipient in any case.

Talk page notifications

Blocking unregistered editors

Current behavior: An admin can block an IP address or IP range directly. Additionally, they can make it an autoblock that retains a cookie in the blocked user's browser, preventing them from editing even if they change their IP address. This feature was introduced a few years ago.

IP-based identity: The behavior stays the same as it is today. IP addresses are masked by default, but admins or patrollers with the appropriate rights can still access them.

Session-based identity: This implementation allows us to retain the current IP-based blocking. It also allows us to perform cookie-only blocks. This can be helpful in some scenarios where people share devices (such as in a library or a cafe) and blocking an IP address or IP range would cause unnecessary collateral damage. Note that this will not work when the vandals are experienced editors who can evade cookie-based blocks.
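
To make the difference between the block types concrete, here is a minimal sketch of a block list that supports IP or range blocks alongside cookie-only blocks. The `BlockList` class and its method names are hypothetical and do not reflect MediaWiki's actual blocking code; only the standard-library `ipaddress` module is used.

```python
import ipaddress


class BlockList:
    """Toy block list with IP/range blocks and cookie-only (session) blocks."""

    def __init__(self):
        self.networks = []     # blocked single IPs or CIDR ranges
        self.sessions = set()  # blocked session cookie tokens

    def block_range(self, cidr):
        # A single address such as "203.0.113.9" becomes a one-address network.
        self.networks.append(ipaddress.ip_network(cidr, strict=False))

    def block_session(self, token):
        self.sessions.add(token)

    def is_blocked(self, ip, token=None):
        addr = ipaddress.ip_address(ip)
        # IP-based check: applies to every device behind the address.
        if any(addr in net for net in self.networks):
            return True
        # Cookie-only check: applies to one browser, wherever it connects from.
        return token is not None and token in self.sessions


blocks = BlockList()
blocks.block_session("cookie-abc")      # cookie-only block on a shared library IP
blocks.block_range("203.0.113.0/24")    # conventional range block
```

In this sketch a cookie-only block leaves other people on the same IP unaffected, and a vandal who discards the cookie is no longer matched, which mirrors the limitation noted above.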

IP Masking implementation approaches (FAQ) (October 2021)

This FAQ answers some questions the community is likely to have about the different implementation approaches we could take for IP masking, and how each of them would impact the community.

Q: After IP masking is implemented, who will be able to see IP addresses?

A: Checkusers, stewards and admins will be able to see complete IP addresses by opting in to a preference where they agree not to share them with others who don't have access to this information.

Editors who take part in anti-vandalism activities, as vetted by the community, can be granted a right to see IP addresses so they can continue their work. This user right would be handled by the community like other user rights and would require a minimum number of edits and a minimum number of days of participation.

All users with accounts over a certain age and with a minimum number of edits (to be determined) will be able to access partially unmasked IPs without permission. This means an IP address will appear with its tail part(s) hidden. This will be accessible via a preference where they agree not to share it with others who don't have access to this information.

All other users will not be able to access IP addresses of unregistered users.
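
The tiers in this answer can be read as a small decision function. The sketch below is one possible reading, not the actual permission system: the group names, the `User` fields, and in particular the numeric thresholds are placeholders, since the real thresholds are still to be determined. Requiring the opt-in for every tier is also an assumption.

```python
from dataclasses import dataclass, field

# Hypothetical group names standing in for checkusers, stewards, admins
# and holders of the community-granted IP viewing right.
FULL_ACCESS_GROUPS = {"checkuser", "steward", "sysop", "ipviewer"}


@dataclass
class User:
    groups: set = field(default_factory=set)
    account_age_days: int = 0
    edit_count: int = 0
    agreed_not_to_share: bool = False  # the opt-in preference


def ip_visibility(user, min_age_days=365, min_edits=500):
    """Return "full", "partial" or "none".

    min_age_days and min_edits are placeholder values only.
    """
    if not user.agreed_not_to_share:
        return "none"
    if user.groups & FULL_ACCESS_GROUPS:
        return "full"
    if user.account_age_days >= min_age_days and user.edit_count >= min_edits:
        return "partial"
    return "none"


admin = User(groups={"sysop"}, agreed_not_to_share=True)
veteran = User(account_age_days=900, edit_count=4000, agreed_not_to_share=True)
newcomer = User(account_age_days=3, edit_count=5, agreed_not_to_share=True)
```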

Q: What are the potential technical implementation options?

A: Over the past few weeks, we have engaged in multiple discussions about the technical possibilities for accomplishing our goal of IP masking while minimizing the impact on our editors and readers. We gathered feedback from different teams and gained different perspectives. Below are the two key paths.

  • IP-based identity: In this approach, we keep everything as it is but replace existing IP addresses with a hashed version of the IP. This preserves many of our existing workflows but does not offer any new benefits.
  • Session-based identity: In this approach, we create an identity for unregistered editors based on a browser cookie that identifies the browser on their device. The cookie persists even when their IP address changes, so their session does not end.

Q: How does the IP-based identity path work?

A: At present, unregistered editors are identified by their IP addresses. This model has worked for our projects for many years. Users who are well versed in IP addresses understand that a single IP address can be used by multiple different users, depending on how dynamic that IP address is. This is more true for IPv6 addresses than for IPv4.

An unregistered user can also change IP addresses if they are commuting or editing from a different location.

If we pursue the IP-based identity solution for IP masking, we would preserve the way IP addresses function today by simply masking them with an encrypted identifier. This solution keeps the IPs distinct while maintaining user privacy. For example, an unregistered user could appear as User:ca1f46 instead of under their IP address.

Benefits of this approach: Preserves existing workflows and models with minimal disruption

Drawbacks of this approach: Does not offer any advantages in a world that is rapidly moving towards more dynamic and less useful IP addresses
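
The page does not specify how the masked identifier would be derived. Purely as an illustration, the sketch below uses a keyed hash (HMAC-SHA256) of the canonical IP address and keeps the first six hexadecimal characters. The function name, the choice of HMAC, the key handling and the six-character length are all assumptions for this example.

```python
import hashlib
import hmac
import ipaddress


def masked_name(ip, secret_key, length=6):
    """Derive a stable masked username from an IP address.

    The address is canonicalised first so that different spellings of the
    same IPv6 address map to the same identifier.
    """
    canonical = ipaddress.ip_address(ip).compressed
    digest = hmac.new(secret_key, canonical.encode("ascii"), hashlib.sha256)
    return "User:" + digest.hexdigest()[:length]


key = b"example-secret-key"  # placeholder; a real key would be kept private
name = masked_name("192.0.2.1", key)
```

The same IP always yields the same name under a given key, so existing per-IP workflows keep working. A six-character identifier would collide quickly at wiki scale, and anyone who obtains the key can enumerate the IPv4 space to reverse the mapping, which is the compromise risk raised in the community feedback.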

Q: How does the session-based identity path work?

A: The path is to create a new identity for unregistered editors based on a cookie placed in their browser. In this approach, there is an auto-generated username to which their edits and actions are attributed. For example, an unregistered user could be given the username User:Anon3406.

In this approach, the user's session persists as long as they have the cookie, even when they change IP addresses.

Benefits of this approach:

  • Ties the user identity to a device browser, offering a more persistent way to communicate with them.
  • User identity does not change when the IP address changes
  • This approach can offer a way for unregistered editors to have access to certain preferences that are currently only available to registered users
  • This approach can offer a way for unregistered editors to convert to a permanent account while retaining their edit history

Drawbacks of this approach:

  • Significant change in the current model of what an unregistered editor represents
  • The identity for the unregistered editor only persists as long as the browser cookie does
  • Vandals in privacy mode or who delete their cookies would get a new identity without changing their IP
  • May require rethinking of some community workflows and tools
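
A minimal sketch of the session-based path, under stated assumptions: a random cookie token is issued on first contact and mapped to an auto-generated username, and a browser that returns the token keeps its name. The `SessionIdentities` class, the in-memory dictionary and the sequential `Anon<n>` numbering are hypothetical and chosen only to mirror the User:Anon3406 example.

```python
import secrets


class SessionIdentities:
    """Toy registry mapping browser cookie tokens to auto-generated usernames."""

    def __init__(self):
        self._names = {}   # cookie token -> username
        self._next_id = 1

    def identify(self, cookie_token=None):
        """Return (cookie_token, username) for a request.

        A missing or unknown token is treated as a new visitor, which is
        exactly what happens when a user clears cookies or uses privacy mode.
        """
        if cookie_token is None or cookie_token not in self._names:
            cookie_token = secrets.token_urlsafe(16)
            self._names[cookie_token] = f"User:Anon{self._next_id}"
            self._next_id += 1
        return cookie_token, self._names[cookie_token]


registry = SessionIdentities()
token, first_name = registry.identify()       # first visit, no cookie yet
_, same_name = registry.identify(token)       # later visit, possibly new IP
_, new_name = registry.identify()             # cookie was cleared
```

Note that the IP address plays no part in the lookup, which is why the identity survives IP changes and why clearing the cookie yields a fresh identity from the same IP.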

Q: Does the Foundation have a preferred path or approach?

A: Our preferred approach will be to go with the session-based identity as that will open up a lot of opportunities for the future. We could address communication issues we’ve had for twenty years. While someone could delete the cookie to get a new identity, the IP would still be visible to all active vandal fighters with the new user right. We do acknowledge that deleting a cookie is easier than switching an IP, of course, and do respect the effects it would have.

Proposal for sharing IP addresses with those who need access (10 June 2021)

Hi everyone. It has been a few months since our last update on this project. We have taken this time to talk to a lot of people — across the editing community and within the Foundation. We have put careful consideration towards weighing all the concerns raised in our discussions with experienced community members about the impact this will have on anti-vandalism efforts across our projects. We have also heard from a significant number of people who support this proposal as a step towards improving privacy of unregistered editors and reducing the legal threat that exposing IPs to the world poses to our projects.

When we talked about this project in the past, we did not have a clear idea of the shape this project will take. Our intention was to understand how IP addresses are helpful to our communities. We have since received a lot of feedback on this front from a number of conversations in different languages and in different communities. We are very grateful to all the community members who took the time to educate us about how moderation works on their wikis or in their specific cross-wiki environment.

We now have a more concrete proposal for this project that we hope will allow for most of the anti-vandalism work to happen undeterred while also restricting access to IP addresses from people who don’t need to see them.

I want to emphasize the word “proposal” because it is in no way, shape or form a final verdict on what will happen. Our intention is to seek your feedback about this idea – What do you think will work? What do you think won’t work? What other ideas can make this better?

We developed these ideas during several discussions with experienced community members, and we’ve refined them in collaboration with our Legal department. Here’s the outline:

  • Checkusers, stewards and admins should be able to see complete IP addresses by opting in to a preference where they agree not to share them with others who don't have access to this information.
  • Editors who partake in anti-vandalism activities, as vetted by the community, can be granted a right to see IP addresses to continue their work. This could be handled in a similar manner as adminship on our projects. The community approval is important to ensure that only editors who truly need this access can get it. The editors will need to have an account that meets some threshold of time since registration (threshold is yet to be decided) and number of edits (number is yet to be decided).
  • All users with accounts that meet some threshold of time since registration (threshold is yet to be decided) and number of edits (number is yet to be decided) will be able to access partially unmasked IPs without permission. This means an IP address will appear with its tail octet(s) – the last part(s) – hidden. This will be accessible via a preference where they agree not to share it with others who don't have access to this information.
  • All other users will not be able to access IP addresses for unregistered users.
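
As an illustration of what "tail octet(s) hidden" could look like, the sketch below replaces the trailing parts of an address with placeholders. How many parts would be hidden, the placeholder text and the treatment of IPv6 (hiding trailing 16-bit groups) are assumptions; the proposal only states that the last part(s) are hidden.

```python
import ipaddress


def partially_mask(ip, hidden_v4_octets=1, hidden_v6_groups=4):
    """Return the address with its trailing parts replaced by placeholders."""
    addr = ipaddress.ip_address(ip)
    if addr.version == 4:
        parts = str(addr).split(".")
        keep = 4 - hidden_v4_octets
        return ".".join(parts[:keep] + ["xxx"] * hidden_v4_octets)
    # IPv6: work on the fully expanded form so there are always 8 groups.
    groups = addr.exploded.split(":")
    keep = 8 - hidden_v6_groups
    return ":".join(groups[:keep] + ["xxxx"] * hidden_v6_groups)


print(partially_mask("198.51.100.42"))                      # 198.51.100.xxx
print(partially_mask("198.51.100.42", hidden_v4_octets=2))  # 198.51.xxx.xxx
```

The visible prefix still supports range-level reasoning, such as recognising edits from the same network, without exposing the exact address.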

IP address access will be logged so that due scrutiny can be performed if and when needed. This is similar to the log we maintain for checkuser access to private data. This is how we hope to balance the need for privacy with the communities’ need to access information to fight spam, vandalism and harassment. We want to give the information to those who need it, but we need a process, we need it to be opt-in so that only those with an actual need will see it and we need the accesses to be logged.
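
The opt-in and logging requirements could be combined roughly as follows. This is a sketch only: the `IPAccessLog` class, its fields and the `known_ips` lookup are invented for the example and are not the format of any existing log.

```python
from datetime import datetime, timezone


class IPAccessLog:
    """Records who revealed which unregistered editor's IP, and when."""

    def __init__(self):
        self.entries = []

    def reveal(self, viewer, target, known_ips, opted_in):
        # Refuse before anything is disclosed if the viewer has not opted in.
        if not opted_in:
            raise PermissionError(f"{viewer} has not opted in to IP access")
        ip = known_ips[target]
        # Every successful reveal leaves an entry that can be audited later.
        self.entries.append({
            "viewer": viewer,
            "target": target,
            "time": datetime.now(timezone.utc),
        })
        return ip


log = IPAccessLog()
known_ips = {"User:Anon3406": "198.51.100.42"}  # illustrative data
revealed = log.reveal("PatrollerA", "User:Anon3406", known_ips, opted_in=True)
```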

We would like to hear your thoughts about this proposed approach. Please give us your feedback on the talk page.

  • What do you think will work?
  • What do you think won’t work?
  • What other ideas can make this better?

Data on Portuguese Wikipedia disabling IP edits

Update 02: Portuguese Wikipedia’s metrics (30 August 2021)

Hello. This is a brief update about Portuguese Wikipedia’s metrics since they started requiring registration to edit. We have a comprehensive report on the Impact report page. This report includes metrics captured through data as well as a survey that was conducted among active Portuguese Wikipedia contributors.

All in all, the report presents the change in a positive light. We have not seen any significant disruption over the time period these metrics have been captured. In light of this, we are now encouraged to run an experiment on two more projects to see if we observe similar impact. All projects are unique in their own ways and what holds true for Portuguese Wikipedia might not hold true for another project. We want to run a limited-time experiment on two projects where registration will be required in order to edit. We estimate that it will take approximately 8 months for us to collect enough data to see significant changes. After that time period, we will return to not requiring registration to edit while we analyse the data. Once the data is published, the community will be able to decide for themselves whether or not they want to continue to disallow unregistered editing on the project.

We are calling this the Login Required Experiment. You will find more detail as well as a timeline on that page. Please use that page and its talk page to discuss this further.

Update 01: Portuguese Wikipedia IP editing restriction

Portuguese Wikipedia banned unregistered editors from making edits to the project last year. Over the last few months, our team has been collecting data about the repercussions of this move on the general health of the project. We have also talked to several community members about their experience. We are working on the final bits to compile all the data that presents an accurate picture of the state of the project. We hope to have an update on this in the near future.


Update 02 on tool development

As you might already know, we are working on building some new tools, partly to soften the impact of IP Masking, but also just to build better anti-vandalism tools for everyone. It is not a secret that the state of moderation tools on our projects doesn’t give the communities the tools they deserve. There is a lot of scope for improvement. We want to build tools that make it easier for anti-vandalism fighters to work effectively. We also want to reduce the barrier to entry into these roles for non-technical contributors.

We have talked about ideas for these tools before and I will provide a brief update on these below. Note that progress on these tools has been slow in the last few months as our team is working on overhauling SecurePoll to meet the needs of the upcoming WMF Board elections.

IP Info feature

Mockup for IP Info

We are building a tool that will display important information about an IP address which is commonly sought in investigations. Typically patrollers, admins and checkusers rely on external websites to provide this information. We hope to make this process easier for them by integrating information from reliable IP-vendors within our websites. We recently built a prototype and conducted a round of user testing to validate our approach. We found that a majority of the editors in the interview set found the tool helpful and indicated they would like to use it in the future. There is an update on the project page that I would like to draw your attention to.

Key questions on which we would like your feedback on the project talk page:

  • When investigating an IP what kinds of information do you look for? Which page are you likely on when looking for this information?
  • What kinds of IP information do you find most useful?
  • What kinds of IP information, when shared, do you think could put our anonymous editors at risk?

Editor matching feature

This project has also been referred to as "Nearby editors" and "Sockpuppet detection" in earlier conversations. We are trying to find a suitable name for it that is understandable even to people who don't understand the word sockpuppetry.

We are in the early stages of this project. Wikimedia Foundation Research has a project that could assist in detecting when two editors exhibit similar editing behaviors. This will help connect different unregistered editors when they edit under different auto-generated account usernames. We heard a lot of support for this project when we started talking about it a year ago. We also heard about the risks of developing such a feature. We are planning to build a prototype in the near term and share it with the community. There is a malnourished project page for this project. We hope to have an update for it soon. Your thoughts on this project are very welcome on the project talk page.

Update 01 on tools development

As mentioned previously, our foremost goal is to provide better anti-vandalism tools for our communities, which will give our vandal fighters a better moderation experience while also making the IP address string less valuable to them. Another important reason to do this is that IP addresses are hard to understand and are really useful only to tech-savvy users. This creates a barrier for new users without a technical background who want to enter functionary roles, as they face a steeper learning curve when working with IP addresses. We hope to get to a place where we can have moderation tools that anyone can use without much prior knowledge.

The first thing we decided to focus on was to make the CheckUser tool more flexible, powerful and easy to use. It is an important tool that services the need to detect and block bad actors (especially long-term abusers) on a lot of our projects. The CheckUser tool was not very well maintained for many years and as a result it appeared quite dated and lacked necessary features.

We also anticipated an uptick in the number of users who opt in to the role of CheckUser on our projects once IP Masking goes into effect. This reinforced the need for a better, easier CheckUser experience for our users. With that in mind, the Anti-Harassment Tools team spent the past year working on improving the CheckUser tool – making it much more efficient and user-friendly. This work has also taken into account a lot of outstanding feature requests by the community. We have continually consulted with CheckUsers and stewards over the course of this project and have tried our best to deliver on their expectations. The new feature is set to go live on all projects in October 2020.

The next feature that we are working on is IP Info. We decided on this project after a round of consultation on six wikis, which helped us narrow down the use cases for IP addresses on our projects. It became apparent early on that there are some critical pieces of information that IP addresses provide which need to be made available for patrollers to be able to perform their roles effectively. The goal for IP Info, thus, is to quickly and easily surface significant information about an IP address. IP addresses provide important information such as location, organization, possibility of being a Tor/VPN node, rDNS and listed range, to mention a few examples. By showing this quickly and easily, without the need for external tools that not everyone can use, we hope to make it easier for patrollers to do their job. The information provided is high-level enough that we can show it without endangering the anonymous user. At the same time, it is enough information for patrollers to be able to make quality judgements about an IP address.

After IP Info we will focus on a "finding similar editors" feature. We’ll be using a machine learning model, built in collaboration with CheckUsers and trained on historical CheckUser data, to compare user behavior and flag when two or more users appear to be behaving very similarly. The model will take into account which pages users are active on, their writing styles, editing times etc. to make predictions about how similar two users are. We are doing our due diligence in making sure the model is as accurate as possible.
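
To give a feel for the signals mentioned here, the sketch below scores two editors on page overlap and editing-time overlap. It is a hand-written heuristic, not the machine learning model described above: the Jaccard and cosine measures, the equal weighting and the `(page, hour)` input format are assumptions, and writing style is left out entirely.

```python
import math
from collections import Counter


def jaccard(pages_a, pages_b):
    """Share of pages edited by both users out of pages edited by either."""
    a, b = set(pages_a), set(pages_b)
    return len(a & b) / len(a | b) if a | b else 0.0


def hour_cosine(hours_a, hours_b):
    """Cosine similarity between the two users' edits-per-hour counts."""
    ca, cb = Counter(hours_a), Counter(hours_b)
    dot = sum(ca[h] * cb[h] for h in ca)
    norm = (math.sqrt(sum(v * v for v in ca.values()))
            * math.sqrt(sum(v * v for v in cb.values())))
    return dot / norm if norm else 0.0


def similarity(edits_a, edits_b, page_weight=0.5):
    """Combine both signals; each edit is a (page_title, hour_of_day) pair."""
    pages = jaccard([p for p, _ in edits_a], [p for p, _ in edits_b])
    hours = hour_cosine([h for _, h in edits_a], [h for _, h in edits_b])
    return page_weight * pages + (1 - page_weight) * hours


user_a = [("Foo", 14), ("Bar", 15)]
user_b = [("Foo", 14), ("Bar", 15), ("Baz", 15)]
user_c = [("Qux", 3)]
score_ab = similarity(user_a, user_b)
score_ac = similarity(user_a, user_c)
```

Scores range from 0 to 1, with higher values suggesting that the accounts deserve a closer manual look rather than proving they belong to the same person.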

Once it’s ready, there is a lot of scope for what such a model can do. As a first step we will be launching it to help CheckUsers detect socks easily without having to perform a lot of manual labor. In the future, we can think about how we can expose this tool to more people and apply it to detect malicious sockpuppeting rings and disinformation campaigns.

You can read more and leave comments on our project page for tools.


IP masking impact report

IP addresses are valuable as a semi-reliable partial identifier, which is not easily manipulated by their associated user. Depending on provider and device configuration, IP address information is not always accurate or precise, and deep technical knowledge and fluency is needed to make best use of IP address information, though administrators are not currently required to demonstrate such fluency to have access. This technical information is used to support additional information (referred to as “behavioural knowledge”) where possible, and the information taken from IP addresses significantly impacts the course of administrative action taken.

A Wikimedia Foundation-supported report on the impact that IP masking will have on our community.

On the social side, the issue of whether to allow unregistered users to edit has been a subject of extensive debate. So far, it has erred on the side of allowing unregistered users to edit. The debate is generally framed around a desire to halt vandalism, versus preserving the ability for pseudo-anonymous editing and lowering the barrier to edit. There is a perception of bias against unregistered users because of their association with vandalism, which also appears as algorithmic bias in tools such as ORES. Additionally, there are major communications issues when trying to talk to unregistered users, largely due to lack of notifications, and because there is no guarantee that the same person will be reading the messages sent to that IP talk page.

In terms of the potential impact of IP masking, it will significantly impact administrator workflows and may increase the burden on CheckUsers in the short term. If or when IP addresses are masked, we should expect our administrators' ability to manage vandalism to be greatly hindered. This can be mitigated by providing tools with equivalent or greater functionality, but we should expect a transitional period marked by reduced administrator efficacy. In order to provide proper tool support for our administrators’ work, we must be careful to preserve or provide alternatives to the following functions currently fulfilled by IP information:

  • Block efficacy and collateral estimation
  • Some way of surfacing similarities or patterns among unregistered users, such as geographic similarity or a shared institution (e.g. edits coming from the same high school or university)
  • The ability to target specific groups of unregistered users, such as vandals jumping IPs within a specific range
  • Location or institution-specific actions (not necessarily blocks); for example, the ability to determine if edits are made from an open proxy, or public location like a school or public library.
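
To make the first and third functions concrete, the following is a minimal sketch, not a description of any existing MediaWiki tool. It assumes a patroller has a list of addresses a vandal has hopped between and a list of addresses recently used by unregistered editors. The function names `smallest_covering_network` and `estimate_collateral` are hypothetical, and the addresses come from ranges reserved for documentation. The sketch finds the narrowest single range that covers the vandal's addresses, then counts how many other recently active addresses a block on that range would also catch.

```python
import ipaddress


def smallest_covering_network(addresses):
    """Return the narrowest single network containing every given address.

    Assumes all addresses are of the same IP version.
    """
    ips = [ipaddress.ip_address(a) for a in addresses]
    lowest, highest = min(ips), max(ips)
    # Start with a single-address network on the lowest address and
    # widen it one prefix bit at a time until the highest address fits.
    network = ipaddress.ip_network(f"{lowest}/{lowest.max_prefixlen}")
    while highest not in network:
        network = network.supernet()
    return network


def estimate_collateral(network, target_addresses, recent_editor_addresses):
    """Count recently active addresses in `network` that are not known targets."""
    targets = {ipaddress.ip_address(a) for a in target_addresses}
    affected = {
        ipaddress.ip_address(a)
        for a in recent_editor_addresses
        if ipaddress.ip_address(a) in network
    }
    return len(affected - targets)


# Hypothetical data for illustration only.
vandal_ips = ["198.51.100.17", "198.51.100.42", "198.51.100.200"]
recent_ips = vandal_ips + ["198.51.100.99", "198.51.100.150", "203.0.113.5"]

block_range = smallest_covering_network(vandal_ips)
collateral = estimate_collateral(block_range, vandal_ips, recent_ips)
print(f"Range {block_range} would also affect {collateral} other address(es)")
```

A replacement for IP-based tooling would need to expose an equivalent of both numbers (how wide a block is, and who else it catches) without revealing the underlying addresses to everyone.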

Depending on how we handle temporary accounts or identifiers for unregistered users, we may be able to improve communication to unregistered users. Underlying discussions and concerns around unregistered editing, anonymous vandalism, and bias against unregistered users are unlikely to significantly change if we mask IPs, provided we maintain the ability to edit projects while logged out.

CheckUser workflow

We interviewed CheckUsers on multiple projects throughout our process for designing the new Special:Investigate tool. Based on interviews and walkthroughs of real-life cases, we broke down the general CheckUser workflow into five sections:

  • Triaging: assessing cases for feasibility and complexity.
  • Profiling: creating a pattern of behaviour which will identify the user behind multiple accounts.
  • Checking: examining IPs and useragents using the CheckUser tool.
  • Judgement: matching this technical information against the behavioural information established in the Profiling step, in order to make a final decision about what kind of administrative action to take.
  • Closing: reporting the outcome of the investigation on public and private platforms where necessary, and appropriately archiving information for future use.

We also worked with staff from Trust and Safety to get a sense for how the CheckUser tool factors into Wikimedia Foundation investigations and cases that are escalated to T&S.

The most common and obvious pain points all revolved around the CheckUser tool's unintuitive information presentation, and the need to open up every single link in a new tab. This caused massive confusion as tab proliferation quickly got out of hand. To make matters worse, the information that CheckUser surfaces is highly technical and not easy to understand at first glance, making the tabs difficult to track. All of our interviewees said that they resorted to separate software or physical pen and paper in order to keep track of information.

We also ran some basic analyses of English Wikipedia's Sockpuppet Investigations page to get some baseline metrics on how many cases they process, how many are rejected, and how many sockpuppets a given report contains.

Patroller use of IP addresses

Previous research on patrolling on our projects has generally focused on the workload or workflow of patrollers. Most recently, the Patrolling on Wikipedia study focused on the workflows of patrollers and on identifying potential threats to current anti-vandal practices. Older studies, such as the New Page Patrol survey and the Patroller work load study, focused on English Wikipedia. They also looked solely at the workload of patrollers, and more specifically at how bot patrolling tools have affected patroller workloads.

Our study tried to recruit from five target wikis:

  • Japanese Wikipedia
  • Dutch Wikipedia
  • German Wikipedia
  • Chinese Wikipedia
  • English Wikiquote

They were selected for known attitudes towards IP edits, percentage of monthly edits made by IPs, and any other unique or unusual circumstances faced by IP editors (namely, use of the Pending Changes feature and widespread use of proxies). Participants were recruited via open calls on Village Pumps or the local equivalent. Where possible, we also posted on Wiki Embassy pages. Unfortunately, while we had interpretation support for the interviews themselves, we did not extend translation support to the messages, which may have accounted for low response rates. All interviews were conducted via Zoom, with a note-taker in attendance.

Supporting the findings from previous studies, we did not find a systematic or unified use of IP information. Additionally, this information was only sought out after a certain threshold of suspicion. Most further investigation of suspicious user activity begins with publicly available on-wiki information, such as checking previous local edits, Global Contributions, or looking for previous bans.

Precision and accuracy were less important qualities for IP information: upon seeing that one chosen IP information site returned three different results for the geographical location of the same IP address, one of our interviewees mentioned that precision in location was not as important as consistency. That is to say, so long as an IP address was consistently exposed as being from one country, it mattered less if it was correct or precise. This fits with our understanding of how IP address information is used: as a semi-unique piece of information associated with a single device or person, that is relatively hard to spoof for the average person. The accuracy or precision of the information attached to the user is less important than the fact that it is attached and difficult to change.

Our findings highlight a few key design aspects for the IP info tool:

  • Provide at-a-glance conclusions over raw data
  • Cover key aspects of IP information:
    • Geolocation (to a city or district level where possible)
    • Registered organization
    • Connection type (high-traffic, such as data center or mobile network versus low-traffic, such as residential broadband)
    • Proxy status as binary yes or no
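
The design aspects above can be sketched as a small data structure plus a one-line summary. This is an illustration under stated assumptions, not the schema of the actual IP info tool: the class `IPInfo`, its field names, the `HIGH_TRAFFIC` set and the `summarize` function are all hypothetical, and the example values are invented.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical grouping of connection types, following the
# high-traffic versus low-traffic split described in the list above.
HIGH_TRAFFIC = {"data center", "mobile network"}


@dataclass(frozen=True)
class IPInfo:
    """Hypothetical record holding the key aspects of IP information."""
    city: Optional[str]             # geolocation, city or district level where possible
    country: Optional[str]
    organization: Optional[str]     # registered organization
    connection_type: Optional[str]  # e.g. "data center", "residential broadband"
    is_proxy: bool                  # proxy status as a binary yes or no


def summarize(info: IPInfo) -> str:
    """Render an at-a-glance conclusion instead of raw lookup data."""
    location = ", ".join(p for p in (info.city, info.country) if p) or "unknown"
    if info.connection_type is None:
        connection = "unknown"
    elif info.connection_type in HIGH_TRAFFIC:
        connection = f"{info.connection_type} (high-traffic)"
    else:
        connection = f"{info.connection_type} (low-traffic)"
    return " | ".join([
        f"Location: {location}",
        f"Organization: {info.organization or 'unknown'}",
        f"Connection: {connection}",
        f"Proxy: {'yes' if info.is_proxy else 'no'}",
    ])


# Invented example values for illustration only.
example = IPInfo(
    city="Utrecht",
    country="Netherlands",
    organization="Example University",
    connection_type="residential broadband",
    is_proxy=False,
)
print(summarize(example))
```

Missing fields are shown as "unknown" rather than guessed, so the summary does not overstate what the underlying lookup can support.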

As an ethical point, it will be important to be able to explain how any conclusions are reached, and to explain the inaccuracy and imprecision inherent in pulling IP information. While this was not a major concern for the patrollers we talked to, if we are to create a tool that will be used to provide justifications for administrative action, we should be careful to make it clear what the limitations of our tools are.

Statements from the Wikimedia Foundation Legal department

Legal Update 02 (July 2021)

First of all, we’d like to thank everyone for participating in these discussions. We appreciate the attention to detail, the careful consideration, and the time that has gone into engaging in this conversation, raising questions and concerns, and suggesting ways that the introduction of masked IPs can be successful. Today, we’d like to explain in a bit more detail how this project came about and the risks that inspired this work, answer some of the questions that have been raised so far, and briefly talk about next steps.


To explain how we arrived here, we’d like to briefly look backwards. Wikipedia and its sibling projects were built to last. Sharing the sum of all knowledge isn’t something that can be done in a year, or ten years, or any of our lifetimes. But while the mission of the communities and Foundation was created for the long term, the technical and governance structures that enable that mission were very much of the time in which they were designed. Many of these features have endured, and thrived, as the context in which they operate has changed. Over the last 20 years, a lot has evolved: the way societies use and relate to the internet, the regulations and policies that impact how online platforms run, and the expectations that users have for how a website will handle their data.

In the past five years in particular, users and governments have become more and more concerned about online privacy and the collection, storage, handling, and sharing of personal data. In many ways, the projects were ahead of the rest of the internet: privacy and anonymity are key to users’ ability to share and consume free knowledge. The Foundation has long collected little information about users, not required an email address for registration, and recognized that IP addresses are personal data (see, for example, the 2014–2018 version of our Privacy policy). More recently, the conversation about privacy has begun to shift, inspiring new laws and best practices: the European Union’s General Data Protection Regulation, which went into effect in May 2018, has set the tone for a global dialogue about personal data and what rights individuals should have to understand and control its use. In the last few years, data protection laws around the world have been changing—look at the range of conversations, draft bills, and new laws in, for example, Brazil, India, Japan, or the United States.

Legal risks

The Foundation’s Privacy team is consistently monitoring this conversation, assessing our practices, and planning for the future. It is our job to look at the projects of today, and evaluate how we can help prepare them to operate within the legal and societal frameworks of tomorrow. A few years ago, as part of this work, we assessed that the current system of publishing IP addresses of non-logged-in contributors should change. We believe it creates risk to users whose information is published in this way. Many do not expect it: even with the notices explaining how attribution works on the projects, the Privacy team often hears from users who have made an edit and are surprised to see their IP address on the history page. Some of them are in locations where the projects are controversial, and they worry that the exposure of their IP address may allow their government to target them. The legal frameworks that we foresaw are in operation, and the publication of these IP addresses poses real risks to the projects and users today.

We’ve heard from several of you that you want to understand more deeply what the legal risks are that inspired this project, whether the Foundation is currently facing legal action, what consequences we think might result if we do not mask IP addresses, etc. (many of these questions have been collected in the expanded list at the end of this section). We’re sorry that we can’t provide more information, since we need to keep some details of the risks privileged. “Privileged” means that a lawyer must keep something confidential, because revealing it could cause harm to their client. That’s why privilege is rarely waived; it’s a formal concept in the legal systems of multiple countries, and it exists for very practical reasons—to protect the client. Here, waiving the privilege and revealing this information could harm the projects and the Foundation. Generally, the Legal Affairs team works to be as transparent as possible; however, an important part of our legal strategy is to approach each problem on a case by case basis. If we publicly discuss privileged information about what specific arguments might be made, or what risks we think are most likely to result in litigation, that could create a road map by which someone could seek to harm the projects and the communities.

That said, we have examined this risk from several angles, taking into account the legal and policy situation in various countries around the world, as well as concerns and oversight requests from users whose IP addresses have been published, and we concluded that IP addresses of non-logged-in users should no longer be publicly visible, largely because they can be associated with a single user or device, and therefore could be used to identify and locate non-logged-in users and link them with their on-wiki activity.

Despite these concerns, we also understood that IP addresses play a major part in the protection of the projects, allowing users to fight vandalism and abuse. We knew that this was a question we’d need to tackle holistically. That’s why a working group from different parts of the Wikimedia Foundation was assembled to examine this question and make a recommendation to senior leadership. When the decision was taken to proceed with IP masking, we all understood that we needed to do this with the communities—that only by taking your observations and ideas into account would we be able to successfully move through this transition.

I want to emphasize that even when IP addresses are masked and new tools are in place to support your anti-vandalism work, this project will not simply end. It’s going to be an iterative process—we will want feedback from you as to what works and what doesn’t, so that the new tools can be improved and adapted to fit your needs.


Over the past months, you’ve had questions, and often, we’ve been unable to provide the level of detail you’re hoping for in our answers, particularly around legal issues.

Q: What specific legal risks are you worried about?

A: We cannot provide details about the individual legal risks that we are evaluating. We realize it’s frustrating to ask why and simply get, “that’s privileged” as an answer. And we’re sorry that we cannot provide more specifics, but as explained above, we do need to keep the details of our risk assessment, and the potential legal issues we see on the horizon, confidential, because providing those details could help someone figure out how to harm the projects, communities, and Foundation.

There are settled answers to some questions.

Q: Is this project proceeding?

A: Yes, we are moving forward with finding and executing on the best way to hide IP addresses of non-logged-in contributors, while preserving the communities’ ability to protect the projects.

Q: Can this change be rolled out differently by location?

A: No. We strive to protect the privacy of all users to the same standard; the change will apply across the Wikimedia projects.

Q: If other information about non-logged-in contributors is revealed (such as location, or ISP), then it doesn’t matter if the IP address is also published, right?

A: That’s not quite the case. In the new system, the information we make available will be general information that is not linked to an individual person or device—for example, providing a city-level location, or noting that an edit was made by someone at a particular university. While this is still information about the user, it’s less specific and individual than an IP address. So even though we are making some information available in order to assist with abuse prevention, we are doing a better job of protecting the privacy of that specific contributor.

Q: If we tell someone their IP address will be published, isn’t that enough?

A: No. As mentioned above, many people have been confused to see their IP address published. Additionally, even when someone does see the notice, the Foundation has legal responsibilities to properly handle their personal data. We have concluded that we should not publish the IP addresses of non-logged-in contributors because it falls short of current privacy best practices, and because of the risks it creates, including risks to those users.

Q: How will masking impact CC-BY-SA attribution?

A: IP masking will not affect CC license attribution on Wikipedia. The 3.0 license for text on the Wikimedia projects already states that attribution should include “the name of the Original Author (or pseudonym, if applicable)” (see the license at section 4c), and use of an IP masking structure rather than an IP address functions equally well as a pseudonym. IP addresses already may vary or be assigned to different people over time, so using them as a proxy for unregistered editors is not different in quality from an IP masking structure, and both satisfy the license's pseudonym requirement. In addition, section 7 of our Terms of use specifies that as part of contributing to Wikipedia, editors agree that links to articles (which include article history) are a sufficient method of attribution.

And sometimes, we don’t know the answer to a question yet, because we’d like to work with you to find the solution.

Q: What should the specific qualifications be for someone to apply for this new user right?

A: There will be an age limit; we have not made a definitive decision about the limit yet, but it’s likely they will need to be at least 16 years old. Additionally, they should be active, established community members in good standing. We’d like to work through what that means with you.

Q: I see that the team preparing these changes is proposing to create a new userright for users to have access to the IP addresses behind a mask. Does Legal have an opinion on whether access to the full IP address associated with a particular username mask constitutes nonpublic personal information as defined by the Confidentiality agreement for nonpublic information, and will users seeking this new userright be required to sign the Access to nonpublic personal data policy or some version of it?
1. If yes, then will I as a checkuser be able to discuss relationships between registered accounts and their IP addresses with holders of this new userright, as I currently do with other signatories?
2. If no, then could someone try to explain why we are going to all this trouble for information that we don't consider nonpublic?
3. In either case, will a checkuser be permitted to disclose connections between registered accounts and unregistered username masks?

A: This is a great question. The answer is partially yes. First, yes, anyone who has access to the right will need to acknowledge in some way that they are accessing this information for the purposes of fighting vandalism and abuse on the projects. We are working on how this acknowledgement will be made; the process to gain access is likely to be something less complex than signing the access to non-public personal data agreement.

As to how this would impact CUs, right now, the access to non-public personal data policy allows users with access to non-public personal data to share that data with other users who are also able to view it. So a CU can share data with other CUs in order to carry out their work. Here, we are maintaining a distinction between logged-in and logged-out users, so a CU would not be able to share IP addresses of logged-in users with users who have this new right, because users with the new right would not have access to such information.

Presuming that the CU also opts in to see IP addresses of non-logged-in users, under the current scheme, that CU would be able to share IP address information demonstrating connections between logged-in users and non-logged-in users who had been masked with other CUs who had also opted in. They could also indicate to users with the new right that they detected connections between logged-in and non-logged-in users. However, the CU could not directly share the IP addresses of the logged-in users with non-CU users who only have the new right.

Please let us know if this sounds unworkable. As mentioned above, we are figuring out the details, and want to get your feedback to make sure it works.

Next steps

Over the next few months, we will be rolling out more detailed plans and prototypes for the tools we are building or planning to build. We’ll want to get your feedback on these new tools that will help protect the projects. We’ll continue to try to answer your questions when we can, and seek your thoughts when we should arrive at the answer together. With your feedback, we can create a plan that will allow us to better protect non-logged-in editors’ personal data, while not sacrificing the protection of Wikimedia users or sites. We appreciate your ideas, your questions, and your engagement with this project.

Legal Update 01 (October 2020)

This statement from the Wikimedia Foundation Legal department was written on request for the talk page and comes from that context. For visibility, we wanted you to be able to read it here too.

Hello All. This is a note from the Legal Affairs team. First, we’d like to thank everyone for their thoughtful comments. Please understand that sometimes, as lawyers, we can’t publicly share all of the details of our thinking; but we read your comments and perspectives, and they’re very helpful for us in advising the Foundation.

On some occasions, we need to keep specifics of our work or our advice to the organization confidential, due to the rules of legal ethics and legal privilege that control how lawyers must handle information about the work they do. We realize that our inability to spell out precisely what we’re thinking and why we might or might not do something can be frustrating in some instances, including this one. Although we can’t always disclose the details, we can confirm that our overall goals are to do the best we can to protect the projects and the communities at the same time as we ensure that the Foundation follows applicable law.

Within the Legal Affairs team, the privacy group focuses on ensuring that the Foundation-hosted sites and our data collection and handling practices are in line with relevant law, with our own privacy-related policies, and with our privacy values. We believe that individual privacy for contributors and readers is necessary to enable the creation, sharing, and consumption of free knowledge worldwide. As part of that work, we look first at applicable law, further informed by a mosaic of user questions, concerns, and requests, public policy concerns, organizational policies, and industry best practices to help steer privacy-related work at the Foundation. We take these inputs, and we design a legal strategy for the Foundation that guides our approach to privacy and related issues. In this particular case, careful consideration of these factors has led us to this effort to mask IPs of non-logged-in editors from exposure to all visitors to the Wikimedia projects. We can’t spell out the precise details of our deliberations, or the internal discussions and analyses that lay behind this decision, for the reasons discussed above regarding legal ethics and privilege.

We want to emphasize that the specifics of how we do this are flexible; we are looking for the best way to achieve this goal in line with supporting community needs. There are several potential options on the table, and we want to make sure that we find the implementation in partnership with you. We realize that you may have more questions, and we want to be clear upfront that in this dialogue we may not be able to answer the ones that have legal aspects. Thank you to everyone who has taken the time to consider this work and provide your opinions, concerns, and ideas.

Best regards,
Anti-Harassment Tools Team

Please use the talk page for discussions on the matter. For any issues concerning this release, please don't hesitate to contact Niharika Kohli, Product Manager – niharika wikimedia.org and cc Sandister Tei, Community Relations Specialist – stei wikimedia.org or leave a message on the talk page.

For more information or documentation on IP editing, masking and an overview of what has been done so far including community discussions, please see the links below.