IP Editing: Privacy Enhancement and Abuse Mitigation


Overview

In recent years, internet users have become more aware of the importance of understanding how their personal data is stored and used. Governments in various countries have drafted new laws to better protect user privacy. The Wikimedia Foundation's Legal and Public Policy teams continually monitor developments in laws around the world, how best to protect user privacy, how to respect user expectations, and how to uphold the values of the Wikimedia movement. Against this backdrop, they have asked us to review the projects and improve them technically. We need to do this with your help.

MediaWiki software stores and publishes the IP addresses of unregistered contributors (as part of their signature, in page histories, and in logs) and shows them to everyone who visits our sites. Publishing these IP addresses endangers the safety and anonymity of these users. In some cases, it can expose people to the risk of government persecution. Although we tell users that their IP address will be visible, few understand the ramifications of this information. We are working on increasing privacy protection for unregistered contributors by hiding their IP address when they contribute to the projects, much as an ordinary reader cannot see the IP address of a registered user. This involves creating a "masked IP" username that is generated automatically but is human-readable. We have various ideas about how to implement this. You can comment and let us know what you need.

The Wikimedia projects have a compelling reason for storing and publishing IP addresses: they play a critical role in keeping vandalism and harassment off our wikis. It is very important that patrollers, admins, and functionaries have tools that can identify and block vandals, sock puppets, editors with conflicts of interest, and other bad actors. Working with you, we want to find ways to protect our users' privacy while keeping our anti-vandalism tools working on par with how they work now. Once this is done, we intend to mask IP addresses on our wikis, which includes restricting the number of people who can see other users' IP addresses and reducing how long IP addresses are retained in our databases and logs. It is important to note that a critical part of this work is ensuring that our wikis continue to have access to the same level of anti-vandalism tools, or better, and are not put at risk of abuse.

The Wikimedia Foundation's goal is to create a set of moderation tools that remove the need for everyone to have direct access to IP addresses. With this evolution of the moderation tools, we will be able to mask the IP addresses of unregistered users. We are well aware that this change will affect current workflows, and we want to make sure that the new tools enable effective moderation, protect the projects from vandalism, and support community oversight.

We can reach the decision stage only by working together with CheckUsers, stewards, admins, and other users who fight vandalism.

This is a very challenging problem, and getting it wrong would jeopardize our ability to protect our wikis; that is why this work has been put off over the past several years. But in light of recent developments in data privacy standards on the internet, new laws, and changing user expectations, the Wikimedia Foundation believes that the time to tackle this problem has now come.

Updates

30 August 2021

Hello. This is a brief update about Portuguese Wikipedia’s metrics since they started requiring registration to edit. We have a comprehensive report on the Impact report page. This report includes metrics captured through data as well as a survey that was conducted among active Portuguese Wikipedia contributors. All in all, the report presents the change in a positive light. We have not seen any significant disruption over the time period these metrics have been captured. In light of this, we are now encouraged to run an experiment on two more projects to see if we observe similar impact. All projects are unique in their own ways and what holds true for Portuguese Wikipedia might not hold true for another project. We want to run a limited-time experiment on two projects where registration will be required in order to edit. We estimate that it will take approximately 8 months for us to collect enough data to see significant changes. After that time period, we will return to not requiring registration to edit while we analyse the data. Once the data is published, the community will be able to decide for themselves whether or not they want to continue to disallow unregistered editing on the project.

We are calling this the Login Required Experiment. You will find more detail as well as a timeline on that page. Please use that page and its talk page to discuss this further.

10 June 2021

Hello everyone. It has been a few months since the last update on this project. During this time we have talked with many people, both in the editing community and within the Foundation. We have taken special care to weigh all the concerns raised in our discussions with experienced community members about the impact this will have on anti-vandalism efforts across our projects. We have also heard from a significant number of people who support this proposal as a step towards improving the privacy of unregistered editors and reducing the legal threat that exposing IP addresses to the world poses to our projects.

When we discussed this project in the past, we did not have a clear idea of what shape it would take. Our intention was to understand how IP addresses are useful to our communities. Since then, we have received a lot of feedback on this from a number of conversations in different languages and different communities. We are very grateful to all the community members who took the time to educate us about how moderation works on their wikis or in their specific cross-wiki environment.

Proposal for sharing IP addresses with those who need access

We now have a more concrete proposal for this project that we hope will allow most anti-vandalism work to continue unhindered, while restricting access to IP addresses for people who do not need to see them. I want to emphasize the word "proposal", because it is in no way, shape, or form a statement of what will happen. We want your feedback on this idea: what do you think will work? What do you think will not work? What ideas could improve it? We developed these ideas through several discussions with experienced community members and refined them in collaboration with the Foundation's Legal department. The outline is as follows:

  • CheckUsers, stewards, and admins should be able to see complete IP addresses by agreeing not to share them with others who do not have access to this information.
  • Editors who take part in anti-vandalism activities, as agreed by the community, can be granted the right to see IP addresses so that they can continue their work. This could be handled in a similar way to adminship on our projects. Community approval is important to ensure that only editors who truly need this access get it. To obtain this access, editors must have an account that is at least a year old and have made at least 500 edits.
  • All users whose accounts are over a year old and who have at least 500 edits will be able to see partial IP addresses without requesting permission. This means that the tail sections of the IP address will be hidden. This will be available through a preference in which the user agrees not to share the IP address with others who do not have access to this information.
  • All other users will not be able to see the IP addresses of unregistered users.
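As a rough illustration only, the tiers above could be sketched as follows. This is not how MediaWiki implements or will implement the feature. The one-year and 500-edit thresholds come from the proposal; the `Viewer` fields, the function names, the choice to require the non-sharing agreement at every tier, and the number of address groups hidden in a partial view are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional
import ipaddress

# Thresholds quoted in the proposal; everything else here is hypothetical.
MIN_ACCOUNT_AGE_DAYS = 365
MIN_EDITS = 500


@dataclass
class Viewer:
    registered_on: date
    edit_count: int
    agreed_not_to_share: bool = False          # assumption: required at every tier
    is_functionary: bool = False               # CheckUser, steward, or admin
    has_community_granted_right: bool = False  # hypothetical name for the new right


def meets_threshold(viewer: Viewer, today: date) -> bool:
    """Account at least a year old and at least 500 edits."""
    age_days = (today - viewer.registered_on).days
    return age_days >= MIN_ACCOUNT_AGE_DAYS and viewer.edit_count >= MIN_EDITS


def access_level(viewer: Optional[Viewer], today: date) -> str:
    """Return 'full', 'partial', or 'none' for a given viewer."""
    if viewer is None or not viewer.agreed_not_to_share:
        return "none"
    if viewer.is_functionary:
        return "full"
    if meets_threshold(viewer, today):
        return "full" if viewer.has_community_granted_right else "partial"
    return "none"


def partially_mask(ip: str) -> str:
    """Hide the tail of an address (assumed: keep half of the groups)."""
    addr = ipaddress.ip_address(ip)
    if addr.version == 4:
        octets = str(addr).split(".")
        return ".".join(octets[:2] + ["xxx", "xxx"])
    groups = addr.exploded.split(":")
    return ":".join(groups[:4] + ["xxxx"] * 4)
```

For example, under these assumptions an editor with a two-year-old account and 800 edits who has opted in would see `203.0.xxx.xxx` for the address `203.0.113.42`, while a logged-out reader would see no address at all.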

Access to IP addresses will be logged so that due scrutiny can be performed if and when needed. This is similar to the log that is kept of CheckUsers' access to private data. This is how we hope to balance the need for privacy with the communities' need to access information to fight spam, vandalism, and harassment. We want to give the information to those who need it, but we need a process: only selected people with a real need should be able to see IP addresses, and access to this information must be logged.

We would like to hear your thoughts on this proposed approach. Please share your feedback with us on the talk page.

  • What do you think will work?
  • What do you think will not work?
  • What ideas would improve this proposal?

Tool development update

As you may already know, we are working on building some new tools, partly to soften the impact of IP masking, but also simply to build better anti-vandalism tools for everyone. It is no secret that the state of moderation tools on our projects does not give the communities the tools they deserve. There is a lot of room for improvement. We want to build tools that make it easier for anti-vandalism fighters to work effectively and efficiently. We also want to lower the barrier to entry into these groups for contributors with fewer technical skills.

We have talked about ideas for these tools before, and I will give a brief summary of them below. Note that progress on these tools has been slow over the last few months because our team has been working on overhauling SecurePoll to meet the needs of the upcoming Wikimedia Foundation Board elections.

IP Info feature

Figure: IP Info mockup

We are building a tool that will display important information about an IP address that is commonly sought during investigations. Typically, patrollers, admins, and CheckUsers rely on external websites to get this information. We hope to make this process easier for them by integrating information from reliable IP vendors into our own sites. We recently built a prototype and ran a round of user testing to validate our approach. We found that a majority of the editors in the interview group found the tool helpful and indicated that they would like to use it in the future. There is an update on the project page that I would like to draw your attention to. The key questions on which we would like your feedback on the project talk page are:

  • When investigating an IP address, what kinds of information do you look for? Which page are you likely to be on when looking for this information?
  • What kinds of IP information do you find most useful?
  • What kinds of IP information, if shared, do you think could put our anonymous editors at risk?

Editor matching feature

In early conversations, this project has also been referred to as "Nearby editors" and "Sockpuppet detection". We are trying to find a suitable name for this feature that makes sense even to people who do not understand the word sockpuppet. We are in the early stages of this project. Wikimedia Foundation Research has a project that could help detect when two editors show similar editing behavior. This will help connect different unregistered editors when they edit under different auto-generated usernames. We heard a lot of support for this project when we started talking about it a year ago. We also heard a lot about the risks of developing such a feature. We are planning to build a prototype in the near future and share it with the community. There is a project page for this project. We hope to publish an update on it soon. Your thoughts on this project are very welcome on the project talk page.

Data on Portuguese Wikipedia disabling IP edits

Portuguese Wikipedia banned unregistered editors from editing the project last year. Over the last few months, our team has been collecting data on the repercussions of this move for the general health of the project. We have also talked to several community members about their experience. We are working on the final steps of compiling all the data that presents an accurate picture of the state of the project. We hope to publish news on this in the near future.

Previous updates

30 October 2020

We have updated the FAQ with questions that were asked on the talk page. The Wikimedia Foundation Legal department added a statement, on request, to the ongoing talk page discussion, and we have also placed it on the main page. On the talk page, we have tried to give a rough outline of our thinking about giving vandal fighters access to the data they need without their having to be CheckUsers or admins.

15 October 2020

The content of this page was largely out of date, so we decided to rewrite parts of it to reflect where we are in the process. This is what it used to look like. We have updated the page with the latest information about the tools we are working on, research, and fuller motivations, and have also added a few things to the FAQ. The most relevant are probably our work on the IP Info feature, the new CheckUser tool, which is now live on four wikis, and our research into the best way to handle IP identification: let us know what you need and what potential problems you see, and tell us whether a combination of IP and cookies could be useful for your workflows.

Tools

As mentioned previously, our foremost goal is to provide better anti-vandalism tools for our communities, which will give our vandal fighters a better moderation experience while also making the IP address string less valuable to them. Another important reason to do this is that IP addresses are hard to understand and are really useful only to tech-savvy users. This creates a barrier for new users without a technical background who want to enter functionary roles, as the learning curve for working with IP addresses is steeper for them. We hope to get to a place where we can have moderation tools that anyone can use without much prior knowledge.

The first thing we decided to focus on was to make the CheckUser tool more flexible, powerful and easy to use. It is an important tool that services the need to detect and block bad actors (especially long-term abusers) on a lot of our projects. The CheckUser tool was not very well maintained for many years and as a result it appeared quite dated and lacked necessary features.

We also anticipated an uptick in the number of users who opt in to the CheckUser role on our projects once IP Masking goes into effect. This reinforced the need for a better, easier CheckUser experience for our users. With that in mind, the Anti-Harassment Tools team spent the past year working on improving the CheckUser tool – making it much more efficient and user-friendly. This work has also taken into account a lot of outstanding feature requests by the community. We have continually consulted with CheckUsers and stewards over the course of this project and have tried our best to deliver on their expectations. The new feature is set to go live on all projects in October 2020.

The next feature that we are working on is IP Info. We decided on this project after a round of consultation on six wikis which helped us narrow down the use cases for IP addresses on our projects. It became apparent early on that there are some critical pieces of information that IP addresses provide which need to be made available for patrollers to be able to perform their roles effectively. The goal for IP Info, thus, is to quickly and easily surface significant information about an IP address. IP addresses provide important information such as location, organization, possibility of being a Tor/VPN node, rDNS, and listed range, to mention a few examples. By being able to show this quickly and easily, without the need for external tools that not everyone can use, we hope to make it easier for patrollers to do their job. The information provided is high-level enough that we can show it without endangering the anonymous user. At the same time, it is enough information for patrollers to be able to make quality judgements about an IP address.

After IP Info, we will focus on a feature for finding similar editors. We’ll be using a machine learning model, built in collaboration with CheckUsers and trained on historical CheckUser data, to compare user behavior and flag when two or more users appear to be behaving very similarly. The model will take into account which pages users are active on, their writing styles, their editing times, and so on, to make predictions about how similar two users are. We are doing our due diligence in making sure the model is as accurate as possible.
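To make the idea of behavioural similarity concrete, here is a deliberately simple sketch that uses two of the signals mentioned above: which pages two users edit, and at what hours. It is not the machine learning model described in the text; it is a hand-written score that combines Jaccard overlap of page sets with cosine similarity of hour-of-day histograms. The function names, the input shape (a list of `(page_title, hour_of_day)` pairs), and the equal weighting are all assumptions for illustration.

```python
from collections import Counter
from math import sqrt


def jaccard(a, b):
    """Overlap of two collections of page titles, from 0.0 to 1.0."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def hour_histogram(hours):
    """Count edits per hour of day (0-23)."""
    counts = Counter(h % 24 for h in hours)
    return [counts.get(h, 0) for h in range(24)]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is empty."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = sqrt(sum(x * x for x in u))
    norm_v = sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity(edits_a, edits_b, page_weight=0.5):
    """Weighted mix of page overlap and editing-time similarity.

    Each argument is a list of (page_title, hour_of_day) pairs.
    """
    pages = jaccard((p for p, _ in edits_a), (p for p, _ in edits_b))
    hours = cosine(
        hour_histogram(h for _, h in edits_a),
        hour_histogram(h for _, h in edits_b),
    )
    return page_weight * pages + (1 - page_weight) * hours
```

A score like this can only suggest that two accounts deserve a closer look. It cannot establish that they belong to the same person, which is why the text stresses accuracy and human review by CheckUsers.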

Once it’s ready, there is a lot of scope for what such a model can do. As a first step we will be launching it to help CheckUsers detect socks easily without having to perform a lot of manual labor. In the future, we can think about how we can expose this tool to more people and apply it to detect malicious sockpuppeting rings and disinformation campaigns.

You can read more and leave comments on our project page for tools.

Motivation

We who are working on this are doing this because the legal and public policy teams advised us that we should evolve the projects’ handling of IP addresses in order to keep up with current privacy standards, laws, and user expectations. That’s really the main reason.

We also think there are other compelling reasons to work on this. If someone wants to help out and doesn’t understand the ramifications of their IP address being publicly stored, their desire to make the world and the wiki a better place results in inadvertently sharing their personal data with the public. This is not a new discussion: we’ve had it for about as long as the Wikimedia wikis have been around. An IP address can be used to find out a user’s geographical location and institution and other personally identifiable information, depending on how the IP address was assigned and by whom. This can sometimes mean that an IP address can be used to pinpoint exactly who made an edit and from where, especially when the editor pool is small in a geographic area. Concerns around exposing IP addresses on our projects have been raised repeatedly by our communities, and the Wikimedia movement as a whole has been talking about how to solve this problem for at least fifteen years. Here’s a (non-exhaustive) list of some of the previous discussions that have happened around this topic.

We acknowledge that this is a thorny issue, with the potential for causing disruptions in workflows we greatly respect and really don’t want to disrupt. We would only undertake this work, and spend so much time and energy on it, for very good reason. These are important issues independently, and together they have inspired this project: there’s both our own need and desire to protect those who want to contribute to the wikis, and developments in the world we live in, and the online environment in which the projects exist.

Research

 
A Wikimedia Foundation-supported report on the impact that IP masking will have on our community.

IP masking impact

IP addresses are valuable as a semi-reliable partial identifier, which is not easily manipulated by their associated user. Depending on provider and device configuration, IP address information is not always accurate or precise, and deep technical knowledge and fluency is needed to make best use of IP address information, though administrators are not currently required to demonstrate such fluency to have access. This technical information is used to support additional information (referred to as “behavioural knowledge”) where possible, and the information taken from IP addresses significantly impacts the course of administrative action taken.

On the social side, the issue of whether to allow unregistered users to edit has been a subject of extensive debate. So far, it has erred on the side of allowing unregistered users to edit. The debate is generally framed around a desire to halt vandalism, versus preserving the ability for pseudo-anonymous editing and lowering the barrier to edit. There is a perception of bias against unregistered users because of their association with vandalism, which also appears as algorithmic bias in tools such as ORES. Additionally, there are major communications issues when trying to talk to unregistered users, largely due to lack of notifications, and because there is no guarantee that the same person will be reading the messages sent to that IP talk page.

In terms of the potential impact of IP masking, it will significantly impact administrator workflows and may increase the burden on CheckUsers in the short term. If or when IP addresses are masked, we should expect our administrators' ability to manage vandalism to be greatly hindered. This can be mitigated by providing tools with equivalent or greater functionality, but we should expect a transitional period marked by reduced administrator efficacy. In order to provide proper tool support for our administrators’ work, we must be careful to preserve or provide alternatives to the following functions currently fulfilled by IP information:

  • Block efficacy and collateral estimation
  • Some way of surfacing similarities or patterns among unregistered users, such as geographic similarity, certain institutions (e.g. if edits are coming from a high school or university)
  • The ability to target specific groups of unregistered users, such as vandals jumping IPs within a specific range
  • Location or institution-specific actions (not necessarily blocks); for example, the ability to determine if edits are made from an open proxy, or public location like a school or public library.
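Two of the functions listed above, targeting a range of addresses and estimating the collateral damage of a block, are straightforward to express with Python's standard `ipaddress` module. The sketch below is illustrative only: the function names and the idea of comparing recently active addresses against a list of known vandal addresses are assumptions, not a description of any existing Wikimedia tool.

```python
import ipaddress


def in_range(ip, cidr):
    """True if the address falls inside the given CIDR range."""
    return ipaddress.ip_address(ip) in ipaddress.ip_network(cidr)


def estimate_collateral(cidr, recent_ips, known_vandal_ips):
    """Count how many recently active addresses a range block would hit.

    'targeted' are addresses already identified as vandals;
    'collateral' are other recently active addresses in the same range.
    """
    network = ipaddress.ip_network(cidr)
    affected = {
        addr
        for addr in map(ipaddress.ip_address, recent_ips)
        if addr in network
    }
    vandals = set(map(ipaddress.ip_address, known_vandal_ips))
    targeted = affected & vandals
    return {"targeted": len(targeted), "collateral": len(affected - targeted)}
```

Any replacement for raw IP addresses would need to preserve this kind of range reasoning, since a masked identifier on its own says nothing about which other editors share a network.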

Depending on how we handle temporary accounts or identifiers for unregistered users, we may be able to improve communication to unregistered users. Underlying discussions and concerns around unregistered editing, anonymous vandalism, and bias against unregistered users are unlikely to significantly change if we mask IPs, provided we maintain the ability to edit projects while logged out.

CheckUser workflow

We interviewed CheckUsers on multiple projects throughout our process for designing the new Special:Investigate tool. Based on interviews and walkthroughs of real-life cases, we broke down the general CheckUser workflow into five sections:

  • Triaging: assessing cases for feasibility and complexity.
  • Profiling: creating a pattern of behaviour which will identify the user behind multiple accounts.
  • Checking: examining IPs and useragents using the CheckUser tool.
  • Judgement: matching this technical information against the behavioural information established in the Profiling step, in order to make a final decision about what kind of administrative action to take.
  • Closing: reporting the outcome of the investigation on public and private platforms where necessary, and appropriately archiving information for future use.

We also worked with staff from Trust and Safety to get a sense for how the CheckUser tool factors into Wikimedia Foundation investigations and cases that are escalated to T&S.

The most common and obvious pain points all revolved around the CheckUser tool's unintuitive information presentation, and the need to open up every single link in a new tab. This caused massive confusion as tab proliferation quickly got out of hand. To make matters worse, the information that CheckUser surfaces is highly technical and not easy to understand at first glance, making the tabs difficult to track. All of our interviewees said that they resorted to separate software or physical pen and paper in order to keep track of information.

We also ran some basic analyses of English Wikipedia's Sockpuppet Investigations page to get some baseline metrics on how many cases they process, how many are rejected, and how many sockpuppets a given report contains.

Patroller use of IP addresses

Previous research on patrolling on our projects has generally focused on the workload or workflow of patrollers. Most recently, the Patrolling on Wikipedia study focuses on the workflows of patrollers and identifying potential threats to current anti-vandal practices. Older studies, such as the New Page Patrol survey and the Patroller work load study, focused on English Wikipedia. They also look solely at the workload of patrollers, and more specifically on how bot patrolling tools have affected patroller workloads.

Our study tried to recruit from five target wikis, which were

  • Japanese Wikipedia
  • Dutch Wikipedia
  • German Wikipedia
  • Chinese Wikipedia
  • English Wikiquote

They were selected for known attitudes towards IP edits, percentage of monthly edits made by IPs, and any other unique or unusual circumstances faced by IP editors (namely, use of the Pending Changes feature and widespread use of proxies). Participants were recruited via open calls on Village Pumps or the local equivalent. Where possible, we also posted on Wiki Embassy pages. Unfortunately, while we had interpretation support for the interviews themselves, we did not extend translation support to the messages, which may have accounted for low response rates. All interviews were conducted via Zoom, with a note-taker in attendance.

Supporting the findings from previous studies, we did not find a systematic or unified use of IP information. Additionally, this information was only sought out after a certain threshold of suspicion. Most further investigation of suspicious user activity begins with publicly available on-wiki information, such as checking previous local edits, Global Contributions, or looking for previous bans.

Precision and accuracy were less important qualities for IP information: upon seeing that one chosen IP information site returned three different results for the geographical location of the same IP address, one of our interviewees mentioned that precision in location was not as important as consistency. That is to say, so long as an IP address was consistently exposed as being from one country, it mattered less if it was correct or precise. This fits with our understanding of how IP address information is used: as a semi-unique piece of information associated with a single device or person, that is relatively hard to spoof for the average person. The accuracy or precision of the information attached to the user is less important than the fact that it is attached and difficult to change.

Our findings highlight a few key design aspects for the IP info tool:

  • Provide at-a-glance conclusions over raw data
  • Cover key aspects of IP information:
    • Geolocation (to a city or district level where possible)
    • Registered organization
    • Connection type (high-traffic, such as data center or mobile network versus low-traffic, such as residential broadband)
    • Proxy status as binary yes or no
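The design aspects above suggest a small, fixed set of fields presented as a one-line conclusion rather than raw lookup output. A minimal sketch of such a record follows. The class name, field names, and output format are hypothetical, and the example does not call any real IP information provider; it only shows how missing values could be stated plainly instead of hidden.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class IPInfoSummary:
    """Hypothetical at-a-glance record for one IP address."""
    city: Optional[str]
    country: Optional[str]
    organization: Optional[str]
    connection_type: Optional[str]  # e.g. "data center", "mobile network"
    is_proxy: bool                  # binary yes/no, as in the list above

    def at_a_glance(self) -> str:
        """One-line summary that names unknown fields explicitly."""
        place = ", ".join(p for p in (self.city, self.country) if p)
        parts = [
            place or "location unknown",
            self.organization or "organization unknown",
            self.connection_type or "connection type unknown",
            "proxy: yes" if self.is_proxy else "proxy: no",
        ]
        return " | ".join(parts)
```

Spelling out "unknown" rather than leaving a field blank is one small way to keep the limitations of the underlying data visible to the patroller.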

As an ethical point, it will be important to be able to explain how any conclusions are reached, and the inaccuracy or imprecisions inherent in pulling IP information. While this was not a major concern for the patrollers we talked to, if we are to create a tool that will be used to provide justifications for administrative action, we should be careful to make it clear what the limitations of our tools are.

FAQ

Q: Will users with advanced permissions such as CheckUsers, Admins, Stewards still have access to IP addresses after this project is complete?

A: We don’t yet have a definitive answer to this question. Ideally, IP addresses should be exposed to as few people as possible (including WMF staff). We hope to restrict IP address exposure to only those users who need to see it.

Q: How would anti-vandalism tools work without IP addresses?

A: There are some potential ideas for achieving this goal. For one, instead of the IP address, we may be able to show functionaries other pertinent information about the user that is equally informative. In addition, in sockpuppet investigations it may be possible to automatically verify whether two separate user accounts link to the same IP without exposing the IP. It’s also possible that anti-vandalism tools will continue to use IP addresses, but will have restricted access. We will need to work closely with the community to find the optimal solutions.

Q: If we don’t see IP addresses, what would we see instead when edits are made by unregistered users?

A: Instead of IP addresses, users will be able to see a unique, automatically-generated, human-readable username. This can look something like “Anonymous 12345”, for example.

Q: Will a new username be generated for every unregistered edit?

A: No. We intend to implement some method to make the generated usernames at least partially persistent, for example, by associating them with a cookie, the user’s IP address, or both.
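One way a generated name could stay stable across edits is to derive it from a persistent token, such as a cookie value, using a keyed hash. The sketch below is only an illustration of that idea and is not the method chosen for the project. The function name, the use of HMAC-SHA256, the server-side secret, and the five-digit format (borrowed from the "Anonymous 12345" example above) are all assumptions.

```python
import hashlib
import hmac


def masked_username(cookie_token: str, secret_key: bytes, digits: int = 5) -> str:
    """Derive a stable, human-readable name from a persistent token.

    The same token and key always yield the same name. The key keeps
    outsiders from recomputing names from guessed tokens.
    """
    digest = hmac.new(secret_key, cookie_token.encode("utf-8"), hashlib.sha256).digest()
    number = int.from_bytes(digest[:8], "big") % (10 ** digits)
    return f"Anonymous {number:0{digits}d}"
```

Note that a five-digit suffix cannot be unique across many editors, so a real system would need either a longer identifier or a stored mapping that checks for collisions before assigning a name.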

Q: Will you also be removing existing IP addresses from the wikis as part of this project?

A: We will not be hiding any existing IP addresses in history, logs or signatures for this project. It will only affect future edits made after this project has been launched.

Q: Is this project the result of a particular law being passed?

A: No. Data privacy standards are evolving in many countries and regions around the world, along with user expectations. We have always worked hard to protect user privacy, and we continue to learn from and apply best practices based on new standards and expectations. This project is the next step in our own evolution.

Q: What is the timeline on this project?

A: As mentioned above, we will not be making any firm decisions about this project until we have gathered input from the communities. We'd like to figure out sensible early steps that a development team could work on soon, so we can get started on what we think will be a long project, but we're not hurrying to meet a particular deadline.

Q: How do I get involved?

A: We would love to hear if you have ideas or feedback about the project! We would especially like to hear if you have any workflows or processes that might be impacted by this project. You can drop your thoughts on the talk page or fill out this form and we’ll reach out to you. Some of us will be at Wikimania and would love to meet you there as well.

Q: Why is this proposal so unclear?

A: It’s not really a proposal and shouldn’t have been described as such. We don’t have a solution, but are trying to work out the best solutions with the communities. It might be helpful to understand this as a technical investigation trying to figure out how IP masking could work.

Q: Why don’t you just turn off the ability to edit without registering an account?

A: Unregistered editing works differently across different Wikimedia wikis. For example, Swedish Wikipedia has discussed unregistered editing in the light of this investigation and decided they still want unregistered editing. Japanese Wikipedia has a far higher percentage of IP editing than English Wikipedia, but the revert rate of those edits is only about a third as high (9.5% compared to 27.4%), indicating that they are also far more useful. We think that deciding for all wikis that they can’t have IP editing is a destructive solution. The research done on IP editing also indicates IP editing is important for editor recruitment.

Q: Who will have access to IPs of unregistered users now?

A: We are not going to leave this burden to e.g. the CheckUsers and the stewards alone. We will have a new user right or the ability to opt in to see the IP if you fulfill certain requirements. Others could potentially see partial IP addresses. We are still talking to the communities about how this could best work.

Q: Has this been decided?

A: Yes. The Wikimedia Foundation’s Legal department has stated that this is necessary. As the entity legally responsible for protecting the privacy of Wikimedia users, the Wikimedia Foundation has accepted this advice and is now working to find the best way to implement this while supporting and listening to the user communities. Some Wikimedians will be unhappy with this, but legal decisions like these have not been a matter of community consensus. What the communities can be part of deciding is how we do this. That very much needs to be defined with the Wikimedia communities.

Q: Will masks be global, for all Wikimedia wikis, or local, for one wiki?

A: Global. A masked IP will look the same across all Wikimedia wikis.

Q: Will all unregistered users be unblocked when this happens? If not, you could track the information in the logs.

A: No. This would wreak havoc on the wikis. This solution will have to be a compromise. We have to balance the privacy of our unregistered editors with our ability to protect the wikis.

Q: Will those who have access to the IPs of unregistered users be able to unmask more than one IP in one action?

A: Yes. We don't want this to be a tedious and time-consuming task, when necessary. We will include this in our proposal.

Statement from the Wikimedia Foundation Legal department

July 2021

First of all, we’d like to thank everyone for participating in these discussions. We appreciate the attention to detail, the careful consideration, and the time that has gone into engaging in this conversation, raising questions and concerns, and suggesting ways that the introduction of masked IPs can be successful. Today, we’d like to explain in a bit more detail how this project came about and the risks that inspired this work, answer some of the questions that have been raised so far, and briefly talk about next steps.

Background

To explain how we arrived here, we’d like to briefly look backwards. Wikipedia and its sibling projects were built to last. Sharing the sum of all knowledge isn’t something that can be done in a year, or ten years, or any of our lifetimes. But while the mission of the communities and Foundation was created for the long term, the technical and governance structures that enable that mission were very much of the time in which they were designed. Many of these features have endured, and thrived, as the context in which they operate has changed. Over the last 20 years, a lot has evolved: the way societies use and relate to the internet, the regulations and policies that impact how online platforms run, and the expectations that users have for how a website will handle their data.

In the past five years in particular, users and governments have become more and more concerned about online privacy and the collection, storage, handling, and sharing of personal data. In many ways, the projects were ahead of the rest of the internet: privacy and anonymity are key to users’ ability to share and consume free knowledge. The Foundation has long collected little information about users, not required an email address for registration, and recognized that IP addresses are personal data (see, for example, the 2014–2018 version of our Privacy policy). More recently, the conversation about privacy has begun to shift, inspiring new laws and best practices: the European Union’s General Data Protection Regulation, which went into effect in May 2018, has set the tone for a global dialogue about personal data and what rights individuals should have to understand and control its use. In the last few years, data protection laws around the world have been changing—look at the range of conversations, draft bills, and new laws in, for example, Brazil, India, Japan, or the United States.

Legal risks

The Foundation’s Privacy team is consistently monitoring this conversation, assessing our practices, and planning for the future. It is our job to look at the projects of today, and evaluate how we can help prepare them to operate within the legal and societal frameworks of tomorrow. A few years ago, as part of this work, we assessed that the current system of publishing IP addresses of non-logged-in contributors should change. We believe it creates risk to users whose information is published in this way. Many do not expect it: even with the notices explaining how attribution works on the projects, the Privacy team often hears from users who have made an edit and are surprised to see their IP address on the history page. Some of them are in locations where the projects are controversial, and they worry that the exposure of their IP address may allow their government to target them. The legal frameworks that we foresaw are in operation, and the publication of these IP addresses poses real risks to the projects and users today.

We’ve heard from several of you that you want to understand more deeply what the legal risks are that inspired this project, whether the Foundation is currently facing legal action, what consequences we think might result if we do not mask IP addresses, etc. (many of these questions have been collected in the expanded list at the end of this section). We’re sorry that we can’t provide more information, since we need to keep some details of the risks privileged. “Privileged” means that a lawyer must keep something confidential, because revealing it could cause harm to their client. That’s why privilege is rarely waived; it’s a formal concept in the legal systems of multiple countries, and it exists for very practical reasons—to protect the client. Here, waiving the privilege and revealing this information could harm the projects and the Foundation. Generally, the Legal Affairs team works to be as transparent as possible; however, an important part of our legal strategy is to approach each problem on a case by case basis. If we publicly discuss privileged information about what specific arguments might be made, or what risks we think are most likely to result in litigation, that could create a road map by which someone could seek to harm the projects and the communities.

That said, we have examined this risk from several angles, taking into account the legal and policy situation in various countries around the world, as well as concerns and oversight requests from users whose IP addresses have been published, and we concluded that IP addresses of non-logged-in users should no longer be publicly visible, largely because they can be associated with a single user or device, and therefore could be used to identify and locate non-logged-in users and link them with their on-wiki activity.

Despite these concerns, we also understood that IP addresses play a major part in the protection of the projects, allowing users to fight vandalism and abuse. We knew that this was a question we’d need to tackle holistically. That’s why a working group from different parts of the Wikimedia Foundation was assembled to examine this question and make a recommendation to senior leadership. When the decision was taken to proceed with IP masking, we all understood that we needed to do this with the communities—that only by taking your observations and ideas into account would we be able to successfully move through this transition.

I want to emphasize that even when IP addresses are masked and new tools are in place to support your anti-vandalism work, this project will not simply end. It’s going to be an iterative process—we will want feedback from you as to what works and what doesn’t, so that the new tools can be improved and adapted to fit your needs.

Questions

Over the past months, you’ve had questions, and often, we’ve been unable to provide the level of detail you’re hoping for in our answers, particularly around legal issues.

What specific legal risks are you worried about?

We cannot provide details about the individual legal risks that we are evaluating. We realize it’s frustrating to ask why and simply get, “that’s privileged” as an answer. And we’re sorry that we cannot provide more specifics, but as explained above, we do need to keep the details of our risk assessment, and the potential legal issues we see on the horizon, confidential, because providing those details could help someone figure out how to harm the projects, communities, and Foundation.

There are settled answers to some questions.

Is this project proceeding?

Yes, we are moving forward with finding and executing on the best way to hide IP addresses of non-logged-in contributors, while preserving the communities’ ability to protect the projects.

Can this change be rolled out differently by location?

No. We strive to protect the privacy of all users to the same standard; this will change across the Wikimedia projects.

If other information about non-logged-in contributors is revealed (such as location, or ISP), then it doesn’t matter if the IP address is also published, right?

That’s not quite the case. In the new system, the information we make available will be general information that is not linked to an individual person or device—for example, providing a city-level location, or noting that an edit was made by someone at a particular university. While this is still information about the user, it’s less specific and individual than an IP address. So even though we are making some information available in order to assist with abuse prevention, we are doing a better job of protecting the privacy of that specific contributor.

If we tell someone their IP address will be published, isn’t that enough?

No. As mentioned above, many people have been confused to see their IP address published. Additionally, even when someone does see the notice, the Foundation has legal responsibilities to properly handle their personal data. We have concluded that we should not publish the IP addresses of non-logged-in contributors because it falls short of current privacy best practices, and because of the risks it creates, including risks to those users.

How will masking impact CC-BY-SA attribution?

IP masking will not affect CC license attribution on Wikipedia. The 3.0 license for text on the Wikimedia projects already states that attribution should include “the name of the Original Author (or pseudonym, if applicable)” (see the license at section 4c), and an IP masking structure functions equally well as a pseudonym in place of an IP address. IP addresses may already vary or be assigned to different people over time, so using them as a proxy for unregistered editors is not different in quality from using an IP masking structure, and both satisfy the license's pseudonym provision. In addition, section 7 of our Terms of use specifies that as part of contributing to Wikipedia, editors agree that links to articles (which include article history) are a sufficient method of attribution.

And sometimes, we don’t know the answer to a question yet, because we’d like to work with you to find the solution.

What should the specific qualifications be for someone to apply for this new user right?

There will be an age limit; we have not made a definitive decision about the limit yet, but it’s likely they will need to be at least 16 years old. Additionally, they should be active, established community members in good standing. We’d like to work through what that means with you.

I see that the team preparing these changes is proposing to create a new userright for users to have access to the IP addresses behind a mask. Does Legal have an opinion on whether access to the full IP address associated with a particular username mask constitutes nonpublic personal information as defined by the Confidentiality agreement for nonpublic information, and will users seeking this new userright be required to sign the Access to nonpublic personal data policy or some version of it?
1. If yes, then will I as a checkuser be able to discuss relationships between registered accounts and their IP addresses with holders of this new userright, as I currently do with other signatories?
2. If no, then could someone try to explain why we are going to all this trouble for information that we don't consider nonpublic?
3. In either case, will a checkuser be permitted to disclose connections between registered accounts and unregistered username masks?

This is a great question. The answer is partially yes. First, yes, anyone who has access to the right will need to acknowledge in some way that they are accessing this information for the purposes of fighting vandalism and abuse on the projects. We are working on how this acknowledgement will be made; the process to gain access is likely to be something less complex than signing the access to non-public personal data agreement.

As to how this would impact CUs, right now, the access to non-public personal data policy allows users with access to non-public personal data to share that data with other users who are also able to view it. So a CU can share data with other CUs in order to carry out their work. Here, we are maintaining a distinction between logged-in and logged-out users, so a CU would not be able to share IP addresses of logged-in users with users who have this new right, because users with the new right would not have access to such information.

Presuming that the CU also opts in to see IP addresses of non-logged-in users, under the current scheme, that CU would be able to share IP address information demonstrating connections between logged-in users and non-logged-in users who had been masked with other CUs who had also opted in. They could also indicate to users with the new right that they detected connections between logged-in and non-logged-in users. However, the CU could not directly share the IP addresses of the logged-in users with non-CU users who only have the new right.

Please let us know if this sounds unworkable. As mentioned above, we are figuring out the details, and want to get your feedback to make sure it works.

Next steps

Over the next few months, we will be rolling out more detailed plans and prototypes for the tools we are building or planning to build. We’ll want to get your feedback on these new tools that will help protect the projects. We’ll continue to try to answer your questions when we can, and seek your thoughts when we should arrive at the answer together. With your feedback, we can create a plan that will allow us to better protect non-logged-in editors’ personal data, while not sacrificing the protection of Wikimedia users or sites. We appreciate your ideas, your questions, and your engagement with this project.

October 2020

This statement from the Wikimedia Foundation Legal department was written on request for the talk page and comes from that context. For greater transparency, we wanted you to be able to read it here as well.

Hello all. This is a note from the Legal Affairs team. First, we’d like to thank everyone for their thoughtful comments. Please understand that sometimes, as lawyers, we can’t publicly share all of the details of our thinking; but we read your comments and perspectives, and they are very helpful to us in advising the Foundation.

On some occasions, we need to keep the specifics of our work or our advice to the organization confidential, due to the rules of legal ethics and legal privilege that govern how lawyers must handle information about the work they do. We realize that our inability to spell out precisely what we’re thinking, and why we might or might not do something, can be frustrating in some instances, including this one. Although we can’t always disclose the details, we can confirm that our overall goals are to do the best we can to protect the projects and the communities while ensuring that the Foundation follows applicable law.

Within the Legal Affairs team, the privacy group focuses on ensuring that the Foundation-hosted sites and our data collection and handling practices are in line with relevant law, with our own privacy-related policies, and with our privacy values. We believe that individual privacy for contributors and readers is necessary to enable the creation, sharing, and consumption of free knowledge worldwide. As part of that work, we look first at applicable law, further informed by a mosaic of user questions, concerns, and requests, public policy concerns, organizational policies, and industry best practices to help steer privacy-related work at the Foundation. We take these inputs, and we design a legal strategy for the Foundation that guides our approach to privacy and related issues. In this particular case, careful consideration of these factors has led us to this effort to mask IPs of non-logged-in editors from exposure to all visitors to the Wikimedia projects. We can’t spell out the precise details of our deliberations, or the internal discussions and analyses that lay behind this decision, for the reasons discussed above regarding legal ethics and privilege.

We want to emphasize that the specifics of how we do this are flexible; we are looking for the best way to achieve this goal in line with supporting community needs. There are several potential options on the table, and we want to make sure that we find the implementation in partnership with you. We realize that you may have more questions, and we want to be clear upfront that in this dialogue we may not be able to answer the ones that have legal aspects. Thank you to everyone who has taken the time to consider this work and provide your opinions, concerns, and ideas.