Talk:Community Wishlist Survey 2023/Edit-recovery feature

Investigations and your input

The CommTech team is reviewing the investigations, discussions, and patches that have happened around this wish to determine what comes next.

We would like you to answer the following questions:

  • How long do we need to save the data for the auto-save functionality? (Keep in mind that legal implications are reduced by reducing the amount of time we store edits.)
  • What should we store in the database to make the auto-save functionality work? For example:
    • Edits
    • Revisions
    • Pages
    • Text

Please leave any responses on this talk page. –– STei (WMF) (talk) 15:51, 19 June 2023 (UTC)

My hope was to leverage the existing stashed edits infrastructure, but increase the TTL (time to live) so that it persists for, say, 10 or maybe even 30 minutes. This avoids the issues of using localStorage (which can only hold so much data), and also the legal implications, as the TTL is so short and sessions are not sharable.
Another idea is to have longer-term storage in a database table (no longer than 90 days, as per legal recommendation), but provide a way for the content to be moderated. This is nice because your unsaved edit can be recovered across different devices. I am still concerned about this, however, especially given all the opposition we saw on the private sandbox wish.
So in addition to the above, my questions for voters of this wish are:
  • What is more important to you -- recovering edits similar to how VisualEditor works, or saving "drafts" per se, provided we address legal/moderation concerns?
  • Do you want to be able to recover edits across different devices?
MusikAnimal (WMF) (talk) 17:43, 21 June 2023 (UTC)
In the context of an auto-save feature that would survive a browser crash, with a single interrupted edit (assuming it's stored on the client side):
  • How long: until the user restarts the browser and is recognized as logged in. At that point, show a notification/alert/popup asking whether they want to continue editing or remove the saved edit. If it's not a browser crash, then the same duration as the current recovery solution.
  • What to store: I don't know exactly what any of the options mean, but it should definitely keep the changes that the user made (mostly in VE, in my case). So I guess it's the Text (a diff against the unmodified article), so that even if there were, say, a restoration problem, you could still copy the saved text.
  • When to auto-save: on switching between editor modes, on preview, when a significant number of new changes have been made (or if an out-of-memory condition is detected, which is my specific case).
  • What is more important: I guess the easiest one to implement (that survives the browser crash)?
  • Recover edits across devices: Doesn't matter to me.
Other notes: For my needs the "auto-save" could probably be a periodic prompt to save current progress, either as a copy in a local text file, or as a draft or sandbox if saved online (probably as a special page that would suppress some features, e.g. adding the page to categories). MarMi wiki (talk) 14:01, 4 July 2023 (UTC)
@MarMi wiki: You raise an interesting point about being logged in or not. Do you mean that the autosave feature should only be available when logged in? I've been looking into the possibility of client-side storage (with indexedDB), and that would mean that the data would actually be available to anyone with access to the device. It would make sense to clear it when the user logs out, but if the feature is available when not logged in then maybe we'd want a way for the user to clear it (although it's the same situation with cookies and localStorage, which I guess we assume people deal with by using private browsing windows where necessary). A shared device is already pretty hard to make secure between the multiple people using it, and it does seem like responsibility for that lies with the browser and operating system, not individual websites. Sam Wilson 03:12, 5 July 2023 (UTC)
@Samwilson:
Logged in: I said that because I edit when logged in (at least when doing the longer edits, where auto-save would be most useful). Also, after a browser crash, when you restart the browser, you could be logged out of Wikipedia (I guess the browser can in some cases lose/corrupt/erase cookies), so the auto-save recovery process would probably trigger after logging in again anyway.
It may be easier (or not) to implement auto-save for users with an account than for those without one (shared/dynamic IPs). MarMi wiki (talk) 10:14, 5 July 2023 (UTC)
@MarMi wiki: That makes sense. It sounds like it's best if we store the recovery data per-user, so that logged-out and logged-in users (including multiple users on the same device) each get their own saved data and can't accidentally clobber each other's. It won't mean the data is actually private though (i.e. anyone can just open the dev tools in the browser to look at it), so we'll have to be careful about how we let people know that having to log back in to recover it doesn't mean that the data is hidden. Sam Wilson 08:43, 7 July 2023 (UTC)
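The per-user storage idea above could be sketched roughly as follows. This is only an illustration: the key layout, the `anon` namespace for logged-out users, and the function name `recoveryKey` are assumptions, not the actual Edit Recovery implementation.

```typescript
// Hypothetical key layout: one recovery record per (user, page, section).
// A logged-out user is passed as null and falls into a shared "anon" namespace.
// A whole-page edit is passed as section === null.
function recoveryKey(user: string | null, page: string, section: number | null): string {
  // The "user:" prefix keeps an account literally named "anon" distinct
  // from the logged-out namespace.
  const owner = user === null ? "anon" : "user:" + encodeURIComponent(user);
  const part = section === null ? "whole" : "section:" + String(section);
  // encodeURIComponent escapes "|" and ":", so the separators stay unambiguous.
  return [owner, encodeURIComponent(page), part].join("|");
}
```

As noted in the comment above, keying like this only stops users from clobbering each other's data; it does not hide the data from anyone who can open the browser's dev tools.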
I have never needed an auto-save feature, but it would be a nice way to calm my mind during edits which turn out to be longer (at the moment I simply copy the text into a text editor when it gets larger).
  • Duration: I guess as long as it takes to restart a browser/PC (+a bit of buffer) should be fine in most cases
  • Content: For me personally, a copy of the page I was editing would be enough (no need for any convenience features)
Nuretok (talk) 16:26, 5 July 2023 (UTC)
A user loses their connection for whatever reason. Others posted above suggesting until the user can restart their browser or system. I think these days the failure is more likely to be in the connection than it is in their local system. It's possible the browser locks up, but it seems more likely that they lose their IP address, change cell towers, experience some denial-of-service of their Internet connection, etc. This past week I was traveling and only had a good connection if I held my phone up above my head. Resting the phone anywhere would drop to one bar and lose data service.
The cause of the problem influences the solution. I think saving locally is more likely to address common issues. 5 MB is a lot of text, and the user can be warned if they approach or exceed the 5 MB limit so they know their changes aren't being auto-saved. If this also solves legal issues, it seems like a better solution.
If storing on the server, you are assuming a good connection. I think you can only store on the server if you also store locally. That way, you cover either a local system failure or a network/connection failure. If saving on the server only, you need to notify users when the connection isn't saving. Google does this on some (maybe all?) of their autosave applications.
How long? My initial thought was 24 hours. Seven days or ten days would also work. 10 minutes assumes they can restore their connection quickly. I'm not sure that's a valid assumption. Certainly there are Internet outages for even well-connected users that last for several hours. I've often been driving (technically riding, someone else was driving) and lost my Internet connection for a couple of hours out in the "boonies" somewhere. The data would only need to be retained until the user's next connection and a response to whether they want to resume the saved edit or discard it (unless legal needs to retain it longer for whatever reason).
I don't see this as being in the scope of an offline editing tool. Saving the current edit text is as far as I would want to go with it. Beyond that, I think you add a lot of implementation complexity without adding immediate value. Once I know I don't have a good connection, I can either stop editing or use some other tool to save my work.
Dave Braunschweig (talk) 21:46, 5 July 2023 (UTC)
This is largely what I had expected too. The situations where I would want it are where the Internet has gone down or my browser has crashed, but I had been typing a bunch of content without copy/pasting into a local file. So for my use cases, at least, storing the plain wikitext that’s currently in my edit window would suffice (and probably zips to smaller than saving a richtext HTML blob from VE, though I’m both guessing and solutionising here, sorry!). (My other use case, incidentally, is that Wikidata’s UI doesn’t seem to mark forms as dirty, so my tab-suspender extension blanks out my work. But that’s definitely stretching scope 😛)
I had largely anticipated localStorage would be the easiest solution but I had completely overlooked the size limit. Given it’s 5 MiB for all of localStorage, might a simpler MVP be to use localStorage and alert users if they are nearing or exceeding that threshold? I would certainly favour something like “here is a first attempt, let’s iterate on that, possibly next year” over “this has been on the wishlist for years, so we need to get it ‘right’ before rolling anything out” -- OwenBlacker (Talk; he/him) 17:59, 6 July 2023 (UTC)
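The "warn when nearing or exceeding the threshold" idea raised in the comments above might look something like the sketch below. The 5 MiB budget comes from the discussion; the two-bytes-per-character estimate, the 80% warning fraction, and all names are assumptions for illustration, and real browsers differ in how they measure the limit.

```typescript
// Assumed budget: 5 MiB shared by everything the site keeps in localStorage.
const QUOTA_BYTES = 5 * 1024 * 1024;

type QuotaStatus = "ok" | "near-limit" | "over-limit";

// Rough size estimate: two bytes per UTF-16 code unit for each key and value.
// This is an approximation, not how any particular browser accounts for storage.
function estimateBytes(entries: Record<string, string>): number {
  let total = 0;
  for (const key of Object.keys(entries)) {
    total += 2 * (key.length + entries[key].length);
  }
  return total;
}

// Classify current usage so the editor can tell the user when their
// changes may no longer be auto-saved.
function quotaStatus(entries: Record<string, string>, warnFraction = 0.8): QuotaStatus {
  const used = estimateBytes(entries);
  if (used > QUOTA_BYTES) return "over-limit";
  if (used >= warnFraction * QUOTA_BYTES) return "near-limit";
  return "ok";
}
```

In a browser, the `entries` argument would be built by iterating over the site's localStorage keys; it is passed in as a plain object here so the logic can be checked in isolation.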
@OwenBlacker: Thanks for pointing out the Wikidata issue. I've been testing local-saving with multiple content types (e.g. CSS, and ProofreadPage's headers/footers) but hadn't thought about the more complicated forms. I think it'd be good to explicitly exclude Wikidata, as it's not like there's normally very much data there to recover anyway, and the time spent editing before saving is usually pretty small compared to long-form wikitext pages. Sam Wilson 08:49, 7 July 2023 (UTC)
@Dave Braunschweig: The 5MB limit is avoided if we use indexedDB. That means we can store as much as we want really, although I think it should purge old recovery data after e.g. 90 days (with or without a message to the user, I'm not really sure). It means it can also give users an overview of what data is stored, as in a list of all pages with any data (if we want to do that). Sam Wilson 08:57, 7 July 2023 (UTC)
@Samwilson: Avoiding the limit is fine. But consider the many devices this might occur on. Using 10s or 100s of MB of computer storage shouldn't be an issue. Using that much storage on my mobile devices could be a problem. I don't see a need to save anything locally once it is saved remotely. And I don't see a need to save recovery data once the user reconnects. "Hey, you have unsaved changes. Do you want to save them now or delete them?" Done. Only retain what you need, and only for the lifetime necessary to meet the need, unless legal is indicating otherwise. At least that's my perspective. -- Dave Braunschweig (talk) 01:33, 8 July 2023 (UTC)
@Dave Braunschweig: Yeah, we need to make sure it's only storing what's required. Certainly, the local data can be deleted after the user has clicked publish (or rather, after the page has loaded so we know it was successful), and if they click cancel. I think there might still be quite a few ways in which they could end up with data saved that isn't really useful to them, so I wonder if it'd be good to have a way of showing what is stored. The proof of concept that I'm working on at the moment has a button that says "Last recovery point 12:34PM" and opens a popup that shows all pages that have any recovery data saved, along with a means of deleting each. Sam Wilson 03:55, 10 July 2023 (UTC)
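The retention rules discussed in this thread (discard after publish or cancel, purge after a fixed age such as 90 days, and list what is stored) can be sketched as below. An in-memory `Map` stands in for indexedDB so the logic is easy to follow; the class and method names are hypothetical and are not taken from the proof of concept.

```typescript
interface RecoveryRecord {
  page: string;
  text: string;
  savedAt: number; // milliseconds since the Unix epoch
}

const DAY_MS = 24 * 60 * 60 * 1000;

// In-memory stand-in for a client-side store such as indexedDB.
class RecoveryStore {
  private records = new Map<string, RecoveryRecord>();

  save(page: string, text: string, now: number): void {
    this.records.set(page, { page, text, savedAt: now });
  }

  // Call after a successful publish, or when the user cancels the edit.
  discard(page: string): void {
    this.records.delete(page);
  }

  // Drop records older than maxAgeDays; returns how many were removed.
  purgeOlderThan(maxAgeDays: number, now: number): number {
    let removed = 0;
    for (const [page, record] of Array.from(this.records)) {
      if (now - record.savedAt > maxAgeDays * DAY_MS) {
        this.records.delete(page);
        removed++;
      }
    }
    return removed;
  }

  // Pages that currently have recovery data, newest first.
  listPages(): string[] {
    return Array.from(this.records.values())
      .sort((a, b) => b.savedAt - a.savedAt)
      .map((r) => r.page);
  }
}
```

Passing `now` in explicitly, rather than reading the clock inside the store, is just a choice that makes the purge rule easy to test.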

Restoring on different devices

I'd love to know what people think about whether it should be possible to restore a lost edit on a different device. It's a bit of a fundamental aspect to the feature, because if we do want to restore elsewhere then we need to be storing the data on the server. It also means that if the network connection is lost then the recovery data can't be stored. Sam Wilson 03:59, 10 July 2023 (UTC)

We're settling on an MVP that can only restore on the same device. This doesn't necessarily exclude future work to make cross-device restoration possible, but for now same-device restoration seems the most important. Sam Wilson 08:23, 14 July 2023 (UTC)
Honestly, I wouldn't expect my lost edits to be saved across devices. Jorm (talk) 00:07, 22 July 2023 (UTC)

Restoring separate sections

Tracked in Phabricator:
Task T344410

Does anyone have an opinion on how it should handle section editing? At the moment it stores recovery data for each section separately, which means any section can be recovered independently (as well as whole-page editing). This works well, I think, except for when a page is edited by someone else before you recover, and they add or remove a section (that stuffs up the section numbering). It seems like it might make sense to throw away all recovery data for a page and all its sections whenever any section or the whole page is saved (or editing cancelled). Does that sound correct? The idea being that if someone's saving a page or part thereof they're saying that they're done with editing. Sam Wilson 05:11, 25 July 2023 (UTC)

There is more discussion about this topic in phab:T344410. Sam Wilson 04:14, 28 August 2023 (UTC)
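The proposal in this section (store drafts per section, but throw away everything for a page once any part of it is saved or cancelled) could be sketched like this. The data layout and function names are assumptions for illustration only; see phab:T344410 for the actual discussion.

```typescript
// Recovery data keyed by page title, then by section
// ("whole" marks a whole-page edit).
type SectionId = number | "whole";
const recoveryData = new Map<string, Map<SectionId, string>>();

function storeDraft(page: string, section: SectionId, text: string): void {
  let sections = recoveryData.get(page);
  if (sections === undefined) {
    sections = new Map<SectionId, string>();
    recoveryData.set(page, sections);
  }
  sections.set(section, text);
}

// Saving or cancelling any part of a page discards every draft for that page,
// so a draft can never be restored against section numbers that have shifted.
function onSaveOrCancel(page: string): void {
  recoveryData.delete(page);
}

function draftCount(page: string): number {
  const sections = recoveryData.get(page);
  return sections === undefined ? 0 : sections.size;
}
```

The trade-off is that an unrelated section draft on the same page is lost when another section is saved, which is the behaviour the question above asks people to confirm.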

Why not use LocalStorage?

(Comment moved here from project page.)

Thanks. As a web developer I do not understand the anti-localStorage concerns. It seems that for all solutions we need to handle collisions in case of multiple tabs, and for all solutions we should auto-clean stuff after some time. A web page's content is just some kilobytes and localStorage allows megabytes. We are not supposed to save every "stub" but just the most recent ones. So, I would be inclined to reconsider localStorage, or to hear more of the concerns. --ValerioBozz (talk) 09:40, 26 August 2023 (UTC)

@Valerio Bozzolan: (I've moved your comment here from the project page; hope that's okay.) The main reason to avoid localStorage is that it's limited to 5 MB, and has to be shared between all usages on a site. You're right that each page isn't that large, but some can be half a megabyte or more, and so it's safer to not use up that quota. There's no limit to the number of pages that might be stored by Edit Recovery. As for collisions between restoring in different tabs: I'm not sure I understand the concern. Do you mean that someone might be editing the same page in two different places in two tabs, and then close both, and then expect to be able to recover each separately? The way that'd work now is that the most recently-edited one would win out. Sam Wilson 04:13, 28 August 2023 (UTC)
Thanks Sam. Now I understand the use case for having more than one page to be recovered. I'm not used to writing more than one "big" edit on more than one page at the same time, but indeed it's something that could be proactively supported. This seems like overkill to me, but, if I understand correctly, it may become a super-loved killer feature for offline contributors. ValerioBozz (talk) 08:34, 28 August 2023 (UTC)
@Valerio Bozzolan: Actually, I'm not sure we really evaluated the idea of only being able to restore a single page; that certainly would've reduced the storage requirements. However, I think the multiple-page approach is good, and we've done a fair bit of it now so I think it's best to carry on. I'll be interested to see how loved this becomes! I sort of feel like it's something that'll disappear into the background and not really be noticed. :) Sam Wilson 09:19, 28 August 2023 (UTC)
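The "most recently-edited one would win out" behaviour for two tabs on the same page, mentioned above, amounts to a last-write-wins rule. A minimal sketch, with a hypothetical `Draft` shape and function name:

```typescript
interface Draft {
  text: string;
  editedAt: number; // milliseconds since the Unix epoch
}

// Keep whichever draft was edited most recently; ties go to the incoming one.
function mergeDraft(existing: Draft | undefined, incoming: Draft): Draft {
  if (existing !== undefined && existing.editedAt > incoming.editedAt) {
    return existing;
  }
  return incoming;
}
```

In practice each tab simply overwriting the stored record as the user types would give the same outcome; the explicit comparison is only there to make the rule visible.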


How did testing Edit-Recovery on beta go?

I tried it without logging in on a heavily-modded-out version of Firefox and it worked just like it's supposed to. Same with a stock install of Brave. Neat! Did not seem to work with a pretty stock install of Safari (using the visual editor) when I was logged out, but worked fine when I was logged in. Also worked fine when I was logged OUT but using the source editor. I tend to mainly use the Source editor so it's possible I'm doing something weird with the visual editor. Jessamyn (talk) 01:16, 26 October 2023 (UTC)

@Jessamyn: Thanks for testing it. You're not doing anything weird: this feature is currently only for the wikitext editor! VisualEditor does have its own recovery system, but that currently only works when you stay within the same tab (e.g. navigating away and back again), whereas we want Edit Recovery to be more robust than that and work if you quit the browser, etc. Sam Wilson 05:35, 26 October 2023 (UTC)
Speaking of VisualEditor’s recovery system, that displays a notification when restoring. I’d find it useful in the wikitext version as well: first to reassure the user that the edit has been recovered, and second to make them aware that the edit window already contains changes compared to the published version, which they may want to undo before they start editing, since after that it gets much more difficult to tell new changes and restored changes apart. -- Tacsipacsi (talk) 21:25, 26 October 2023 (UTC)
@Tacsipacsi: The notification feature is very nearly done! It's in code review at the moment. It'll look something like the screenshots here: phab:T342721#9260675. It's got buttons for showing the diff, and discarding the recovered data. Sam Wilson 00:33, 27 October 2023 (UTC)
The notification is now done and can be tested on Beta. See what you think. Sam Wilson 06:07, 7 November 2023 (UTC)

Opt-out

While I find this project generally quite useful, I personally more often abandon edits intentionally by closing the browser tab than unintentionally (unintentional abandonment includes accidentally pressing Ctrl+W, power outage, browser freeze etc.), so I don’t want to waste my disk space and be annoyed by auto-restored edits that are old and no longer relevant. Could there be a per-user opt-out? -- Tacsipacsi (talk) 21:38, 26 October 2023 (UTC)

@Tacsipacsi: That sounds interesting. Would you want to be able to turn it on and off per-page or per-session, or is it more that you'd just never want it on? Would a checkbox in Preferences be okay? One idea we've got in the works at the moment is to add a Special page to list all stored data (on all pages), and maybe on that page we could have an on/off toggle for the feature. Sam Wilson 01:03, 27 October 2023 (UTC)
@Samwilson: I think a checkbox in preferences would be the most useful. I can’t imagine never wanting to have edit recovery on a given page, but wanting to have it on other pages, so per-page doesn’t really make sense. If I want to not have edit recovery for a given session, I can just press the Cancel link afterwards to cancel the recovery for that time (although I’m not sure how intuitive it is; it’s probably worth mentioning in the general information section you envision on Special:EditRecovery), or use Special:EditRecovery or the notification popup to discard afterwards. -- Tacsipacsi (talk) 13:39, 28 October 2023 (UTC)
@Tacsipacsi: Yeah, all good points. I think the special page will be able to contain lots of this info. I've created task T350653 to track the addition of the preference. Sam Wilson 06:05, 7 November 2023 (UTC)
Or should we have a back + forward button like in Microsoft Word, Excel or code editors? JrandWP (talk) 00:40, 15 November 2023 (UTC)
@JrandWP: That sounds interesting, but I'm not quite sure what you mean. What would backwards and forwards do in the context of enabling/disabling Edit Recovery? Where would the buttons go? Sam Wilson 01:39, 15 November 2023 (UTC)
Like having a back + forward button to roll back changes that you are making... JrandWP (talk) 03:01, 15 November 2023 (UTC)
 
@JrandWP: The current way of rolling back changes is via a notification (see example at right) that is shown when the edit form is opened. Does this match your idea? There's only a single edit saved for recovery at any point, so there's no way or need to navigate between different versions of unsaved data. Sam Wilson 02:16, 22 November 2023 (UTC)
So you can discard your changes and this is a good thing. JrandWP (talk) 03:11, 22 November 2023 (UTC)