Increasing privacy and preventing abuse of IP addresses
What is IP masking and why is the Wikimedia Foundation masking IP addresses?
IP masking hides the IP addresses of unregistered editors on Wikimedia projects, fully or partially, from everyone except those who need access in order to fight spam, vandalism, harassment, and disinformation.
Currently, anyone can edit Wikimedia wikis without a Wikimedia account and without logging in. MediaWiki, the software behind the Wikimedia projects, will record and publish your IP address in its public log. Anyone who looks for your IP address will find it.
Wikimedia projects have a good reason to store and publish IP addresses: they play a critical role in keeping vandalism and harassment off our wikis.
However, your IP address can reveal where you edited from and can be used to identify you or your device. This is especially concerning if you edit from a place where our wikis are considered controversial. Publishing your IP address may allow others to locate you.
With changes in privacy laws and standards (for example, the General Data Protection Regulation and the global debate about privacy that it sparked), the Wikimedia Foundation's Legal team has decided to protect user privacy by hiding IP addresses from the general public. However, we will continue to give access to users who need to see the addresses in order to protect the wikis.
We are aware that this change will affect the current workflows of the people who fight abuse. We are committed to developing tools, or maintaining access to tools, that can identify and block vandals, sockpuppets, editors with conflicts of interest, and other bad actors after IP addresses are masked.
April 2023: The Plan for IP Masking
As promised, here's an update about how IP Masking would work.
It will cover the changes for both unregistered and registered editors. We want to acknowledge at the outset that we still have many open questions and undecided points. This is our initial plan and does not cover everything we aim to do during this project. As we proceed, we are discovering new pieces of previously unforeseen work.
Your feedback will help us understand what more we can do to make IP Masking easier on our communities.
This update is in an FAQ format, as that makes the upcoming changes clear and understandable.
What does IP Masking change from the perspective of a non-logged-in editor?
Currently, before a non-logged-in user completes an edit, they are informed that their edits will be attributed to their IP address.
In the future, before a non-logged-in user completes an edit, they will be informed that their edits will be attributed to a temporary account. Its name will be a number, incrementing for each new account. The account will be tied to a cookie that lives in the user's browser. As long as that cookie exists, the user will keep the same temporary account, and all their edits will be attributed to that account. The IP addresses of the user may change, but the temporary account will not change as long as the cookie exists. A temporary account generated on one wiki will also work on other wikis that the user may contribute to.
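As a rough illustration only (not the actual MediaWiki implementation), the cookie-based attribution described above could be sketched as follows; the `~` prefix, the cookie name, and all identifiers here are hypothetical, since the final username format is still undecided:

```python
import itertools


class TemporaryAccountAllocator:
    """Toy sketch of cookie-based temporary accounts (not MediaWiki code).

    The '~' prefix is a placeholder; the real prefix is still undecided.
    """

    COOKIE_NAME = "tempAccount"  # hypothetical cookie key

    def __init__(self):
        self._counter = itertools.count(1)  # auto-incrementing account number

    def get_or_create(self, cookies: dict) -> str:
        # If the browser already holds a temporary-account cookie,
        # keep attributing edits to that same account...
        if self.COOKIE_NAME in cookies:
            return cookies[self.COOKIE_NAME]
        # ...otherwise mint the next account name and store it in the cookie.
        name = f"~{next(self._counter)}"
        cookies[self.COOKIE_NAME] = name
        return name


allocator = TemporaryAccountAllocator()
session = {}                                # stands in for one browser's cookie jar
first = allocator.get_or_create(session)    # new account
second = allocator.get_or_create(session)   # same cookie, same account
other = allocator.get_or_create({})         # fresh browser, next number
```

Note that in this scheme the account survives IP address changes, because attribution depends only on the cookie, not on the connection.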
What will temporary usernames look like?
We don't know yet.
Our initial mockups considered using an asterisk as a prefix followed by an auto-incrementing number. (Example:
*12345.) You will find these mockups below.
But as some volunteers pointed out, the asterisk is not a good choice because of an outstanding MediaWiki bug.
We are discussing different prefix options and will be conducting user tests with these.
Our current top candidates (in no particular order) are:
- Caret (^)
- Hyphen (-)
- Tilde (~)
- Exclamation mark (!)
- Question mark (?)
- Year prefix
Do any of these strike you as a great or a terrible choice? Please add your comments either on the talk page or Phabricator.
- ↑ (While the question mark is a great sign for something unknown and is widely understood, there are details we're still figuring out. For example, it would need to be encoded into the URL as %3F. This URL encoding shouldn't be a problem, but would be a hiccup for users who are used to typing in URLs by hand.)
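The encoding issue in the footnote can be checked directly with Python's standard library: a question-mark prefix must be percent-encoded because `?` is reserved in URLs, while a tilde passes through untouched:

```python
from urllib.parse import quote

# '?' is reserved in URLs (it starts the query string), so a
# question-mark-prefixed username must be percent-encoded as %3F:
encoded = quote("?12345")

# '~' is an unreserved character (RFC 3986), so it needs no encoding:
plain = quote("~12345")

print(encoded, plain)  # %3F12345 ~12345
```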
How long do temporary usernames persist for?
Some time after the first edit (tentatively one year), or as a result of the user clearing their cookies, the cookie will automatically expire.
Existing edits will still be attributed to it, though.
After the old username expires, if the user edits again in the future, they will be granted a new temporary account.
What does IP Masking change from the perspective of a patroller?
Limited IP address exposure
The biggest change is that IP addresses will no longer be visible to the general public.
Anyone who does not have an account or does not meet the required thresholds for IP address access (see Legal's update) will not be able to see IP addresses. To mitigate the impact on patrolling, we will be releasing improvements to the IP Info feature.
This will include data from the Spur service.
Obtaining access to IP addresses
Together with the Foundation's Legal department, we have developed new guidelines.
These define who will be able to access IP addresses and how. Users who meet the requirements will be able to opt in to revealing IP addresses through Special:Preferences. See how the reveal functionality will work in detail.
Access and reveal actions will be logged, and the log will be available to a limited group of users (CheckUsers, stewards, Trust & Safety).
Better communication channels with temporary editors
Temporary accounts will be linked to a browser cookie.
As long as the cookie persists, the user's edits will be attributed to the same temporary account. Temporary account holders will also be able to receive talk page notifications just like registered users. We hope this will allow for better communication with temporary users. It may also resolve some long-standing issues raised by the communities (see T278838).
Documenting IP addresses for vandals
It will be possible to publicly document the IP addresses of bad actors through long-term abuse pages, as is done currently.
However, care should be taken not to expose the IP addresses of other temporary users. When discussing suspected bad actors, tools like suppression should be used if the user turns out not to be the suspected vandal.
More details about this can be found in the guidelines.
Tools available for patrolling
Like IP editors, temporary users can be checked and patrolled through Special:Block, Special:Checkuser and Special:Investigate.
Additionally, IP Info Feature can be used to access information about the underlying IP address for the given revision.
We are developing guidelines for Cloud tools and bots to access IPs for patrolling.
We will have an update for this soon.
What happens to existing IP addresses on our sites?
Existing IP addresses that are already recorded on our wikis will remain untouched.
Edits made after IP Masking is deployed will be attributed to temporary usernames.
Since we will roll out IP Masking gradually, this change will happen on different wikis at different times.
How will the IP address reveal functionality work?
Users who can access IP addresses will be able to expose IP addresses for temporary accounts.
Mockups for how this functionality would work:
What will happen to tools and bots that rely on IP addresses to function?
We are working to understand the impact on volunteer-maintained tools.
This is a task for our team as well as the Research and Engineering teams. Next, we will work with Legal to understand which tools may continue to access IP addresses and the guidelines for how they can operate.
We will provide an update on this page once we have a plan of action.
We plan to test IP Masking slowly, to include ample time for communities' feedback and testing.
We want our rollouts not to hinder communities' processes. Another priority is to avoid undesirable outcomes for the health of the communities. We have implemented metrics that we plan to watch as we roll out the changes.
We are looking for communities that would be candidates for testing launch (piloting) of IP Masking. We are considering criteria such as number of IP edits the communities receive, urgency of anti-vandalism work, size of the project, and potential for disruption. We will have another update on this page about our chosen candidates closer to the launch of IP Masking. If you'd like your community to test the launch of IP Masking, please make a decision as a community and let us know on the talk page.
Data on restricting editing by unregistered users on Portuguese Wikipedia
Portuguese Wikipedia’s metrics following restriction
Hello. This is a short update on the metrics gathered from Portuguese Wikipedia since the community began requiring registration in order to edit. We have a comprehensive report on this page. The report includes metrics captured from data, as well as a survey conducted among users who contribute to Portuguese Wikipedia.
All in all, the report presents the change in a positive light. We have not seen any significant disruption over the time period these metrics have been captured. In light of this, we are now encouraged to run an experiment on two more projects to see if we observe similar impact. All projects are unique in their own ways and what holds true for Portuguese Wikipedia might not hold true for another project. We want to run a limited-time experiment on two projects where registration will be required in order to edit. We estimate that it will take approximately 8 months for us to collect enough data to see significant changes. After that time period, we will return to not requiring registration to edit while we analyse the data. Once the data is published, the community will be able to decide for themselves whether or not they want to continue to disallow unregistered editing on the project.
We are calling this the registration-required experiment. You can find more details, as well as a timeline, on the experiment's page. Please use that page and its talk page to continue the discussion.
Portuguese Wikipedia IP editing restriction
We have talked about these tools before, and below is a short update on them. Please note that progress on these tools has been slow in recent months, because our teams were busy upgrading SecurePoll to meet the needs of the Board of Trustees election.
IP Info tool
We are building a tool that will display important information about an IP address, which is often needed during an investigation. Currently, admins, patrollers, and checkusers rely on external websites to provide this information. We hope to make their work easier by integrating information from reliable IP providers into our own sites. We recently built a prototype and ran a round of user testing to validate our approach. Most of the editors interviewed found the tool helpful and said they would want to use it in the future. We would like to draw your attention to the update on the project page. Key questions on which we would like your feedback on the talk page:
- When investigating an IP address, what kind of information do you look for? Which page do you usually use to find that information?
- Which kinds of IP information are most useful to you?
- Which kinds of IP information could put anonymous editors at risk if shared?
Editor matching tool
In earlier conversations this tool has also been called "Nearby editors" and "Sockpuppet detection". We are trying to find a suitable name that will be understandable even to people who don't know what a "sockpuppet" is.
We are at an early stage of this project. The Wikimedia Foundation has a project that can help identify two editors with similar behavior. It could help link two unregistered users who edit under two automatically generated usernames. The project received a lot of support when we first talked about it years ago. We have also heard about the risks involved in building such a tool. We plan to build a prototype soon and share it with the community. The project has a page that has not received enough attention; we hope it will be updated soon. We would love to hear your thoughts about this project on its talk page.
The first thing we decided to focus on was building a CheckUser tool that is more flexible, powerful, and easy to use. It is an important tool that serves the need to detect and block abusive activity (in particular, long-term abuse) on many of our projects. As a result of insufficient maintenance over many years, the CheckUser tool was outdated and missing essential features.
We also anticipated that, with the transition to IP masking, more users would apply for the CheckUser role on the various projects. This expectation reinforced the need for a better, easier tool for checkusers. With that in mind, the Anti-Harassment Tools team spent the past year improving the CheckUser tool, making it much more user-friendly and efficient. This work also addressed additional requests for the tool that came from the community. We consulted continuously with checkusers and stewards throughout this project and tried our best to meet their expectations. The new tool is expected to be deployed on all projects in October 2020.
The next feature we are working on is IP Info. We decided on this project after a round of consultation on six wikis, which helped us narrow down the use cases for IP addresses on our projects. It became clear early on that IP addresses provide several critical pieces of information that patrollers need in order to do their job effectively. The goal of IP Info, then, is to surface meaningful information about an IP address quickly and easily. IP addresses provide important information such as location, organization, likelihood of being a Tor/VPN node, rDNS data, and registered range, to name a few examples. By being able to show this quickly and easily, without the need for external tools that not everyone can use, we hope to make patrollers' work easier. The information provided is high-level enough that we can display it without putting the anonymous user at risk. At the same time, it is enough information for patrollers to make a quality judgement about an IP address.
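Some of the IP-derived signals listed above can be computed offline with Python's standard library. This is only a sketch under that assumption, not the IP Info implementation; real geolocation, organization, and Tor/VPN data would come from an external provider:

```python
import ipaddress


def basic_ip_facts(addr: str, registered_range: str) -> dict:
    """Sketch of 'cheap' IP facts; geolocation/Tor status need external data."""
    ip = ipaddress.ip_address(addr)
    return {
        # rDNS is resolved by querying this reverse-pointer name in DNS:
        "reverse_pointer": ip.reverse_pointer,
        # routable on the public internet (vs. private/reserved space)?
        "is_global": ip.is_global,
        # does the address fall inside a registered (e.g. WHOIS) range?
        "in_registered_range": ip in ipaddress.ip_network(registered_range),
        "version": ip.version,
    }


# 198.51.100.0/24 is a documentation range, used here as a stand-in
# for a registered range returned by a WHOIS lookup.
facts = basic_ip_facts("198.51.100.7", "198.51.100.0/24")
```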
After IP Info we will be focusing on a finding similar editors feature. We’ll be using a machine learning model, built in collaboration with CheckUsers and trained on historical CheckUser data to compare user behavior and flag when two or more users appear to be behaving very similarly. The model will take into account which pages users are active on, their writing styles, editing times etc. to make predictions about how similar two users are. We are doing our due diligence in making sure the model is as accurate as possible.
Once it’s ready, there is a lot of scope for what such a model can do. As a first step we will be launching it to help CheckUsers detect socks easily without having to perform a lot of manual labor. In the future, we can think about how we can expose this tool to more people and apply it to detect malicious sockpuppeting rings and disinformation campaigns.
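As an illustration only (the actual model is trained on historical CheckUser data with far richer features), comparing two editors' behavior can be reduced to cosine similarity over simple feature vectors; the page names and edit data below are hypothetical:

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def features(edits):
    """Toy feature vector: edit counts per page plus an hour-of-day histogram."""
    pages = ["PageA", "PageB", "PageC"]  # hypothetical fixed page set
    page_counts = [sum(1 for p, _ in edits if p == page) for page in pages]
    hours = [0] * 24
    for _, hour in edits:
        hours[hour] += 1
    return page_counts + hours


# Two suspected socks editing the same pages at the same times of day...
sock_a = features([("PageA", 3), ("PageA", 3), ("PageB", 4)])
sock_b = features([("PageA", 3), ("PageB", 4), ("PageB", 4)])
# ...versus an unrelated editor on a different page and schedule.
unrelated = features([("PageC", 15), ("PageC", 16)])

similar_score = cosine(sock_a, sock_b)      # high overlap
unrelated_score = cosine(sock_a, unrelated)  # no overlap
```

A production model would of course weigh many more signals (writing style, revert patterns, session timing) and calibrate a threshold before flagging anything for human review.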
You can read more and leave comments on our project page for tools.
IP addresses are valuable as a semi-reliable partial identifier that is not easily manipulated by the associated user. Depending on provider and device configuration, IP address information is not always accurate or precise, and deep technical knowledge and fluency are needed to make the best use of it, though administrators are not currently required to demonstrate such fluency to have access. This technical information is used to support additional information (referred to as "behavioural knowledge") where possible, and the information taken from IP addresses significantly impacts the course of administrative action taken.
On the social side, whether to allow unregistered users to edit has been a subject of extensive debate. So far, communities have erred on the side of allowing it. The debate is generally framed as a desire to halt vandalism versus preserving the ability to edit pseudo-anonymously and keeping the barrier to editing low. There is a perception of bias against unregistered users because of their association with vandalism, which also appears as algorithmic bias in tools such as ORES. Additionally, there are major communication issues when trying to talk to unregistered users, largely due to the lack of notifications, and because there is no guarantee that the same person will be reading the messages sent to an IP talk page.
In terms of the potential impact of IP masking, it will significantly impact administrator workflows and may increase the burden on CheckUsers in the short term. If or when IP addresses are masked, we should expect our administrators' ability to manage vandalism to be greatly hindered. This can be mitigated by providing tools with equivalent or greater functionality, but we should expect a transitional period marked by reduced administrator efficacy. In order to provide proper tool support for our administrators’ work, we must be careful to preserve or provide alternatives to the following functions currently fulfilled by IP information:
- Block efficacy and collateral estimation
- Some way of surfacing similarities or patterns among unregistered users, such as geographic similarity, certain institutions (e.g. if edits are coming from a high school or university)
- The ability to target specific groups of unregistered users, such as vandals jumping IPs within a specific range
- Location or institution-specific actions (not necessarily blocks); for example, the ability to determine if edits are made from an open proxy, or public location like a school or public library.
Depending on how we handle temporary accounts or identifiers for unregistered users, we may be able to improve communication to unregistered users. Underlying discussions and concerns around unregistered editing, anonymous vandalism, and bias against unregistered users are unlikely to significantly change if we mask IPs, provided we maintain the ability to edit projects while logged out.
We interviewed CheckUsers on multiple projects throughout our process for designing the new Special:Investigate tool. Based on interviews and walkthroughs of real-life cases, we broke down the general CheckUser workflow into five sections:
- Triaging: assessing cases for feasibility and complexity.
- Profiling: creating a pattern of behaviour which will identify the user behind multiple accounts.
- Checking: examining IPs and useragents using the CheckUser tool.
- Judgement: matching this technical information against the behavioural information established in the Profiling step, in order to make a final decision about what kind of administrative action to take.
- Closing: reporting the outcome of the investigation on public and private platforms where necessary, and appropriately archiving information for future use.
We also worked with staff from Trust and Safety to get a sense for how the CheckUser tool factors into Wikimedia Foundation investigations and cases that are escalated to T&S.
The most common and obvious pain points all revolved around the CheckUser tool's unintuitive information presentation, and the need to open up every single link in a new tab. This caused massive confusion as tab proliferation quickly got out of hand. To make matters worse, the information that CheckUser surfaces is highly technical and not easy to understand at first glance, making the tabs difficult to track. All of our interviewees said that they resorted to separate software or physical pen and paper in order to keep track of information.
We also ran some basic analyses of English Wikipedia's Sockpuppet Investigations page to get some baseline metrics on how many cases they process, how many are rejected, and how many sockpuppets a given report contains.
Previous research on patrolling on our projects has generally focused on the workload or workflow of patrollers. Most recently, the Patrolling on Wikipedia study focuses on the workflows of patrollers and identifying potential threats to current anti-vandal practices. Older studies, such as the New Page Patrol survey and the Patroller work load study, focused on English Wikipedia. They also looked solely at the workload of patrollers, and more specifically at how bot patrolling tools have affected patroller workloads.
Our study tried to recruit from five target wikis, which were
- Japanese Wikipedia
- Dutch Wikipedia
- German Wikipedia
- Chinese Wikipedia
- English Wikiquote
They were selected for known attitudes towards IP edits, percentage of monthly edits made by IPs, and any other unique or unusual circumstances faced by IP editors (namely, use of the Pending Changes feature and widespread use of proxies). Participants were recruited via open calls on Village Pumps or the local equivalent. Where possible, we also posted on Wiki Embassy pages. Unfortunately, while we had interpretation support for the interviews themselves, we did not extend translation support to the messages, which may have accounted for low response rates. All interviews were conducted via Zoom, with a note-taker in attendance.
Supporting the findings from previous studies, we did not find a systematic or unified use of IP information. Additionally, this information was only sought out after a certain threshold of suspicion. Most further investigation of suspicious user activity begins with publicly available on-wiki information, such as checking previous local edits, Global Contributions, or looking for previous bans.
Precision and accuracy were less important qualities for IP information: upon seeing that one chosen IP information site returned three different results for the geographical location of the same IP address, one of our interviewees mentioned that precision in location was not as important as consistency. That is to say, so long as an IP address was consistently exposed as being from one country, it mattered less if it was correct or precise. This fits with our understanding of how IP address information is used: as a semi-unique piece of information associated with a single device or person, that is relatively hard to spoof for the average person. The accuracy or precision of the information attached to the user is less important than the fact that it is attached and difficult to change.
Our findings highlight a few key design aspects for the IP info tool:
- Provide at-a-glance conclusions over raw data
- Cover key aspects of IP information:
- Geolocation (to a city or district level where possible)
- Registered organization
- Connection type (high-traffic, such as data center or mobile network versus low-traffic, such as residential broadband)
- Proxy status as binary yes or no
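The "at-a-glance conclusions over raw data" principle above could look like the following sketch, where raw provider fields (all field names here are hypothetical) are collapsed into one human-readable line:

```python
def summarize_ip_info(raw: dict) -> str:
    """Collapse raw IP metadata into one at-a-glance line (sketch only).

    All field names below are hypothetical; a real tool would map whatever
    its data provider returns, and should surface its own uncertainty
    (geolocation is often imprecise).
    """
    location = raw.get("city") or raw.get("country") or "unknown location"
    org = raw.get("org", "unknown organization")
    # High-traffic connection types (data centers, mobile carriers) make
    # blocks riskier because many users may share the same address.
    conn = "high-traffic" if raw.get("type") in ("hosting", "mobile") else "low-traffic"
    proxy = "proxy: yes" if raw.get("is_proxy") else "proxy: no"
    return f"{location} · {org} · {conn} · {proxy}"


summary = summarize_ip_info({
    "city": "Reykjavík",
    "org": "Example ISP hf.",   # hypothetical organization
    "type": "residential",
    "is_proxy": False,
})
```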
As an ethical point, it will be important to be able to explain how any conclusions are reached, and the inaccuracy or imprecisions inherent in pulling IP information. While this was not a major concern for the patrollers we talked to, if we are to create a tool that will be used to provide justifications for administrative action, we should be careful to make it clear what the limitations of our tools are.
Statements from the Wikimedia Foundation Legal department
Hello! Please review the new Access to temporary account IP addresses page for details about how users can gain access to IP addresses. The section on using IP addresses will be updated with details about how and where to access the IP addresses, as well as what is logged when IP addresses are accessed. Please also review a new related page with frequently asked questions. You will notice that both pages use the term "temporary user accounts," which comes from the MVP—more information about the MVP will be shared directly on this page soon. If you have questions or concerns, please reach out on the talk page.
First of all, we’d like to thank everyone for participating in these discussions. We appreciate the attention to detail, the careful consideration, and the time that has gone into engaging in this conversation, raising questions and concerns, and suggesting ways that the introduction of masked IPs can be successful. Today, we’d like to explain in a bit more detail how this project came about and the risks that inspired this work, answer some of the questions that have been raised so far, and briefly talk about next steps.
To explain how we arrived here, we’d like to briefly look backwards. Wikipedia and its sibling projects were built to last. Sharing the sum of all knowledge isn’t something that can be done in a year, or ten years, or any of our lifetimes. But while the mission of the communities and Foundation was created for the long term, the technical and governance structures that enable that mission were very much of the time they were designed. Many of these features have endured, and thrived, as the context in which they operate has changed. Over the last 20 years, a lot has evolved: the way societies use and relate to the internet, the regulations and policies that impact how online platforms run as well as the expectations that users have for how a website will handle their data.
The Foundation’s Privacy team is consistently monitoring this conversation, assessing our practices, and planning for the future. It is our job to look at the projects of today, and evaluate how we can help prepare them to operate within the legal and societal frameworks of tomorrow. A few years ago, as part of this work, we assessed that the current system of publishing IP addresses of non-logged-in contributors should change. We believe it creates risk to users whose information is published in this way. Many do not expect it—even with the notices explaining how attribution works on the projects, the Privacy team often hears from users who have made an edit and are surprised to see their IP address on the history page. Some of them are in locations where the projects are controversial, and they worry that the exposure of their IP address may allow their government to target them. The legal frameworks that we foresaw are in operation, and the publication of these IP addresses pose real risks to the projects and users today.
We’ve heard from several of you that you want to understand more deeply what the legal risks are that inspired this project, whether the Foundation is currently facing legal action, what consequences we think might result if we do not mask IP addresses, etc. (many of these questions have been collected in the expanded list at the end of this section). We’re sorry that we can’t provide more information, since we need to keep some details of the risks privileged. “Privileged” means that a lawyer must keep something confidential, because revealing it could cause harm to their client. That’s why privilege is rarely waived; it’s a formal concept in the legal systems of multiple countries, and it exists for very practical reasons—to protect the client. Here, waiving the privilege and revealing this information could harm the projects and the Foundation. Generally, the Legal Affairs team works to be as transparent as possible; however, an important part of our legal strategy is to approach each problem on a case by case basis. If we publicly discuss privileged information about what specific arguments might be made, or what risks we think are most likely to result in litigation, that could create a road map by which someone could seek to harm the projects and the communities.
That said, we have examined this risk from several angles, taking into account the legal and policy situation in various countries around the world, as well as concerns and oversight requests from users whose IP addresses have been published, and we concluded that IP addresses of non-logged-in users should no longer be publicly visible, largely because they can be associated with a single user or device, and therefore could be used to identify and locate non-logged-in users and link them with their on-wiki activity.
Despite these concerns, we also understood that IP addresses play a major part in the protection of the projects, allowing users to fight vandalism and abuse. We knew that this was a question we’d need to tackle holistically. That’s why a working group from different parts of the Wikimedia Foundation was assembled to examine this question and make a recommendation to senior leadership. When the decision was taken to proceed with IP masking, we all understood that we needed to do this with the communities—that only by taking your observations and ideas into account would we be able to successfully move through this transition.
I want to emphasize that even when IP addresses are masked and new tools are in place to support your anti-vandalism work, this project will not simply end. It’s going to be an iterative process—we will want feedback from you as to what works and what doesn’t, so that the new tools can be improved and adapted to fit your needs.
Over the past months, you’ve had questions, and often, we’ve been unable to provide the level of detail you’re hoping for in our answers, particularly around legal issues.
- Q: What specific legal risks are you worried about?
A: We cannot provide details about the individual legal risks that we are evaluating. We realize it’s frustrating to ask why and simply get, “that’s privileged” as an answer. And we’re sorry that we cannot provide more specifics, but as explained above, we do need to keep the details of our risk assessment, and the potential legal issues we see on the horizon, confidential, because providing those details could help someone figure out how to harm the projects, communities, and Foundation.
There are settled answers to some questions.
- Q: Is this project proceeding?
A: Yes, we are moving forward with finding and executing on the best way to hide IP addresses of non-logged-in contributors, while preserving the communities’ ability to protect the projects.
- Q: Can this change be rolled out differently by location?
A: No. We strive to protect the privacy of all users to the same standard; this will change across the Wikimedia projects.
- Q: If other information about non-logged-in contributors is revealed (such as location, or ISP), then it doesn’t matter if the IP address is also published, right?
A: That’s not quite the case. In the new system, the information we make available will be general information that is not linked to an individual person or device—for example, providing a city-level location, or noting that an edit was made by someone at a particular university. While this is still information about the user, it’s less specific and individual than an IP address. So even though we are making some information available in order to assist with abuse prevention, we are doing a better job of protecting the privacy of that specific contributor.
- Q: If we tell someone their IP address will be published, isn’t that enough?
A: No. As mentioned above, many people have been confused to see their IP address published. Additionally, even when someone does see the notice, the Foundation has legal responsibilities to properly handle their personal data. We have concluded that we should not publish the IP addresses of non-logged-in contributors because it falls short of current privacy best practices, and because of the risks it creates, including risks to those users.
- Q: How will masking impact CC-BY-SA attribution?
And sometimes, we don’t know the answer to a question yet, because we’d like to work with you to find the solution.
- Q: What should the specific qualifications be for someone to apply for this new user right?
A: There will be an age limit; we have not made a definitive decision about the limit yet, but it’s likely they will need to be at least 16 years old. Additionally, they should be active, established community members in good standing. We’d like to work through what that means with you.
- I see that the team preparing these changes is proposing to create a new userright for users to have access to the IP addresses behind a mask. Does Legal have an opinion on whether access to the full IP address associated with a particular username mask constitutes nonpublic personal information as defined by the Confidentiality agreement for nonpublic information, and will users seeking this new userright be required to sign the Access to nonpublic personal data policy or some version of it?
  1. If yes, then will I as a checkuser be able to discuss relationships between registered accounts and their IP addresses with holders of this new userright, as I currently do with other signatories?
  2. If no, then could someone try to explain why we are going to all this trouble for information that we don't consider nonpublic?
  3. In either case, will a checkuser be permitted to disclose connections between registered accounts and unregistered username masks?
A: This is a great question. The answer is partially yes. First, anyone who has access to the right will need to acknowledge in some way that they are accessing this information for the purposes of fighting vandalism and abuse on the projects. We are working on how this acknowledgement will be made; the process to gain access is likely to be something less complex than signing the access to non-public personal data agreement.
As to how this would impact CUs, right now, the access to non-public personal data policy allows users with access to non-public personal data to share that data with other users who are also able to view it. So a CU can share data with other CUs in order to carry out their work. Here, we are maintaining a distinction between logged-in and logged-out users, so a CU would not be able to share IP addresses of logged-in users with users who have this new right, because users with the new right would not have access to such information.
Presuming that the CU also opts in to see IP addresses of non-logged-in users, under the current scheme, that CU would be able to share, with other CUs who had also opted in, IP address information demonstrating connections between logged-in users and masked non-logged-in users. They could also indicate to users with the new right that they detected connections between logged-in and non-logged-in users. However, the CU could not directly share the IP addresses of logged-in users with non-CU users who only have the new right.
Please let us know if this sounds unworkable. As mentioned above, we are figuring out the details, and want to get your feedback to make sure it works.
Over the next few months, we will be rolling out more detailed plans and prototypes for the tools we are building or planning to build. We’ll want to get your feedback on these new tools that will help protect the projects. We’ll continue to try to answer your questions when we can, and seek your thoughts when we should arrive at the answer together. With your feedback, we can create a plan that will allow us to better protect non-logged-in editors’ personal data, while not sacrificing the protection of Wikimedia users or sites. We appreciate your ideas, your questions, and your engagement with this project.
This statement from the Wikimedia Foundation Legal department was written on request for the talk page and comes from that context. For visibility, we wanted you to be able to read it here too.
Hello All. This is a note from the Legal Affairs team. First, we’d like to thank everyone for their thoughtful comments. Please understand that sometimes, as lawyers, we can’t publicly share all of the details of our thinking; but we read your comments and perspectives, and they’re very helpful for us in advising the Foundation.
On some occasions, we need to keep specifics of our work or our advice to the organization confidential, due to the rules of legal ethics and legal privilege that control how lawyers must handle information about the work they do. We realize that our inability to spell out precisely what we’re thinking and why we might or might not do something can be frustrating in some instances, including this one. Although we can’t always disclose the details, we can confirm that our overall goals are to do the best we can to protect the projects and the communities at the same time as we ensure that the Foundation follows applicable law.
Within the Legal Affairs team, the privacy group focuses on ensuring that the Foundation-hosted sites and our data collection and handling practices are in line with relevant law, with our own privacy-related policies, and with our privacy values. We believe that individual privacy for contributors and readers is necessary to enable the creation, sharing, and consumption of free knowledge worldwide. As part of that work, we look first at applicable law, further informed by a mosaic of user questions, concerns, and requests, public policy concerns, organizational policies, and industry best practices to help steer privacy-related work at the Foundation. We take these inputs, and we design a legal strategy for the Foundation that guides our approach to privacy and related issues. In this particular case, careful consideration of these factors has led us to this effort to mask IPs of non-logged-in editors from exposure to all visitors to the Wikimedia projects. We can’t spell out the precise details of our deliberations, or the internal discussions and analyses that lay behind this decision, for the reasons discussed above regarding legal ethics and privilege.
We want to emphasize that the specifics of how we do this are flexible; we are looking for the best way to achieve this goal in line with supporting community needs. There are several potential options on the table, and we want to make sure that we find the implementation in partnership with you. We realize that you may have more questions, and we want to be clear upfront that in this dialogue we may not be able to answer the ones that have legal aspects. Thank you to everyone who has taken the time to consider this work and provide your opinions, concerns, and ideas.
Anti-Harassment Tools Team
Please use the talk page for discussions on the matter. For any issues concerning this release, please don't hesitate to contact Niharika Kohli, Product Manager – niharika@wikimedia.org and cc Whatamidoing, Community Relations Specialist – whatamidoing@wikimedia.org, or leave a message on the talk page.
For more information or documentation on IP editing, masking and an overview of what has been done so far including community discussions, please see the links below.
About IP Editing · About IP Addresses · IP Editing Restriction Study · Impact report for Login Required Experiment on Portuguese Wikipedia · Research:Value of IP Editing