Some data on external sites contains errors that can be corrected there. But in Mix'n'Match, auxiliary statements cannot be changed, only new ones added. As a result, every time the "auxiliary matcher" runs, the IDs get linked incorrectly again. This has already affected a couple of dozen entries across several catalogues for me: I periodically find items linked via outdated data that cannot be changed. It would be great to have something like a cross button that could remove incorrect auxiliary statements from entries in MnM.
User talk:Magnus Manske
I'd add that in my opinion, too, this would be very useful.
Oops, I forgot I had already posted this here. But I also recently added another post suggesting a possible stopgap: overriding the current value, if not with NULL then at least with a space, so that the value remains but the matcher won't use it: Topic:Y6vre34iz07033ao
See also User talk:Plexci#Wikidata:Database reports/Constraint violations/P5319 because of Mix'n'Match catalog for César Award person ID (P5319).
Are you all talking about individual entries, or "all inception (P571) for catalog xyz"?
I think that Solidest and I are talking about individual entries.
In my case, it would probably be even better to completely purge the property from all entries in the entire catalog, because I usually reupload the data for the whole catalog from scratch. But deleting from individual entries will probably still be the more common use case.
At least for unpatrolled Wikidata users; preferably for everybody until it can be improved. See Wikidata:Administrators' noticeboard#Please revert June 20 edits by User:Jephtah Ogyefo Acquah for the latest problems, which are beyond what I can keep up with.
Deactivated until further notice
The duplicate-authors merge inside the Wikidata game seems to be back: Special:Contributions/Awinsongyababa (and I fear a certain percentage of these merges may well be wrong). Personally I would suggest deactivating it again.
Oh, this is terrible: hundreds of person merges by inexperienced Wikidata users once again! Magnus, please turn this off and keep it off, at least until somebody with experience can carefully review what it's doing.
I have also requested to turn off the generic merge game (Topic:Y8vvrrfnnminerl2) for the same reason.
I have deactivated both the "classic" and the distributed merge game.
Hi @Magnus Manske and @Epìdosis - thanks for deactivating, but is there some delay? See https://www.wikidata.org/w/index.php?title=Q57317983&diff=prev&oldid=2211826205 - this edit was from earlier today. ~~~~
Forgot to update the JSON file with the game list from the database. Fixed now.
Hi Magnus! Excuse me for disturbing you. I see that a few days ago you suspended author merges in the New distributed game. I would now ask you to also suspend "Merge items" in the Distributed game; I am currently undoing more than 200 wrong merges made through it by a new user (Special:Contributions/Nkpelawuni) in less than half an hour, and it's not the first case in recent months. I have also tried the game myself a few times and found that the great majority of the proposed merges were wrong, so I fear it is having a mostly negative impact on our data quality. Thanks in advance!
deactivated
mixnmatch:6059 always shows as updating, but the scrape never actually starts.
The catalog has been scraped
Not using Mix-n-Match any more. I ended up using my own script and uploading via CSV due to the lack of developer response here.
Hi Magnus,
I've just added a new set for Te Papa agent IDs, which supersedes a set from 2017. Can the old set be taken down? Thanks heaps.
Yes, https://mix-n-match.toolforge.org/#/catalog/362 deactivated.
Thanks Epidosis! Just reopening this because I messed up the IDs in the new dataset and think it'd be easiest if I loaded it fresh. Can you please also remove this one?
Hi Magnus, https://sourcemd.toolforge.org/index_old.php has not been working for the past week or so. Hoping it will be up again soon as it is such a fabulous workhorse. MargaretRDonald (talk) 20:26, 23 September 2023 (UTC)
I was afraid [https://www.wikidata.org/w/index.php?title=Q60055775&diff=2029624243&oldid=1852619792 this] was going to happen. Please, please, please do not offer people matches where (1) the family name is extremely common and (2) the coauthor list for an author numbers in the thousands, including people with the same first initial and last name. Or find some other way to prevent this from happening.
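A rough sketch of the kind of guard rail being requested here, purely illustrative (the surname list, threshold, and function names are all invented, not part of any actual Mix'n'Match code):

```python
# Hypothetical pre-filter: skip match suggestions when the family name is
# very common AND the candidate author's co-author list is huge, since those
# are exactly the cases where naive players make wrong merges.

COMMON_SURNAMES = {"wang", "smith", "kim", "li", "garcia"}  # illustrative only

def should_offer_match(family_name: str, coauthor_count: int,
                       max_coauthors: int = 1000) -> bool:
    """Return False for matches considered too risky to show to casual players."""
    if family_name.lower() in COMMON_SURNAMES and coauthor_count > max_coauthors:
        return False
    return True
```

With such a filter, a "Smith" with thousands of co-authors would simply never be offered as a candidate, while rarer names or small co-author lists would still go through.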
Hello, I have imported mixnmatch:6299 and mixnmatch:6300 for Archnet site ID (P7323) and Archnet authority IDs (which don't have a property yet) using the import form. Both ended up having 0 IDs though the example entries in the preview looked fine. Do you have an idea what might have gone wrong? Thanks a lot in advance for any help or recommendations!
I manually ran an update, should be all there now.
A thousand thanks!
Hi - I've just been spending many hours dealing with another few hundred duplicate-authors actions; this tool is *really* not ready for prime time. Latest example: https://www.wikidata.org/w/index.php?title=Q4757048&diff=2133123022&oldid=2118661421 - how can you merge "Andrew G. White" and "Alex White"? There have been far too many similar merges like this that I've had to revert - and I'm concerned that somebody's going to go in again and re-merge them if the tool is recommending this.

Also I'm finding many cases of incorrect merges where the issue is likely that the wrong person was assigned to a paper by the other author in Wikidata, so your tool thinks they're the same person because of that. I had a case yesterday where a political scientist with the same name as a biochemist was set as the author of some biochemistry papers, and your tool then was used to merge them. What actually would have been useful in a case like that was to point out that those biochem papers had likely been assigned to the wrong person as author. Can you adjust your tool to do that?

In any case, until some significant fixes are in place this needs to be turned off ASAP.
By the way part of the problem (including the political scientist one) is that ORCID sometimes has these bad author assignments - some of the data in ORCID comes from services like Scopus that can be mistaken on things like this.
Another common issue I'm seeing is two different authors with the same family name and same first initial co-authoring a paper. This often happens with husband-wife teams; other family relationships can also result in paper co-authoring, and even just being from the same region of the world, two individuals may be likely to share a surname. Your duplicate-authors game will always try to merge these cases, because of course being co-authors on a paper means they are both co-authors with the other authors on that paper. Instead, I think it should catch cases like this as a sign that they *are not* the same person. Having both Wikidata IDs as co-authors on the same work should be an indicator that they are distinct, not the same. I know there are exceptions where real duplicates exist, but at the least this should be handled differently from cases where the two are never co-authors.
Dear Magnus,
can we please just *turn this off* for new users? I just spent a couple of hours reviewing this user's edits - a totally new Wikidata user, making 57 merges with the "distributed game - duplicate authors", of which I had to revert 27 (i.e. 53% good, 47% bad merges). So many of these merges were obviously wrong - completely different given names etc. Some were subtle. None should have been done by a completely fresh wikidata user. How do new users even run into this? In any case, merging should not be treated as a game like this. Bad merges have severe knock-on effects, and if there's a data problem causing your game to think two people should be merged, a good portion of the time merging is not the right solution.
Hi Magnus! If it isn't too much trouble for you, can you please deactivate this catalog? Unfortunately I just messed it up when I uploaded it.
Deactivated