Comrade:CriticalResist/sandbox/Library stuff

Some useful regex functions

You can use search and replace with regex in the visual editor: press Ctrl+F on your keyboard, then click the icon showing (.*). That turns on the regex (regular expression) function, which interprets your search string as a regular expression.

Unfortunately, you can't use regex in the replace field, but here are some useful regex patterns to help you clean up pages. Copy the code from the "Regex code" column into the search bar (yes, it looks like gibberish; that's normal).

Regex code | What it does | Use for
\[(\d+)\] | Detects references like [12] | Turn plain-text references into formatted MediaWiki references (Ctrl+Shift+K adds a reference). Leave the replace field blank. This will probably leave extra spaces behind, so run the next pattern afterwards.
\s{2,} | Detects two or more spaces in a row | Search for this code, put a single space in the replace field, then press replace all to reduce every run of extra spaces to just one.
\s(?=[.,;:!?)%]) | Detects a space before punctuation | Removes a space before punctuation such as . or ? or ; (leave the replace field blank). Be careful before clicking replace all: look through the instances one by one first.
\s-{1,2}|\s\u2013 | Detects one or two hyphens, or an en dash (–), after a space | These usually stand in for an em dash (—). Both - and -- are sometimes used incorrectly this way, and the en dash should be replaced as well. Replace each match with an em dash, with a space before it but not after.
(?<!\s)\d+\s*[.\-\)] | Detects plain-text ordered lists (e.g. 1), 2), etc.) | We should use MediaWiki lists instead of plain-text ones. Use this only to find instances, then apply the list format manually.
page \d+ | Detects "page X", where X is any number | Text imported from online documents sometimes keeps a "page X" (page 5, page 12...) left over from the original PDF. You can quickly remove these with this code.
(?<![.,;!?])\n | Detects superfluous line breaks | Text imported from a PDF usually has extra line breaks that shouldn't be there. This code detects most of them (new lines NOT preceded by punctuation that would normally warrant a new line). Replace with a single space.
[.,;:!?][A-Za-z0-9] | Detects a letter or digit right after punctuation | Finds e.g. "Hello.How are you?", where there should be a space after the period. Matches uppercase and lowercase A-Z as well as digits 0-9. Unfortunately, you have to fix these manually.

Note: detecting line breaks doesn't work in MediaWiki. Import your text into, say, Google Docs, use its search function, then bring the result into ProleWiki. You still have to go through the text manually, but it saves a lot of time and cramps.
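If you are comfortable running a script, the same patterns can be tried outside the editor. The following is a minimal sketch in Python, assuming Python's `re` module treats these patterns the same way the visual editor does (not verified against the editor). The function and dictionary names are made up for illustration. It splits the patterns into two groups: ones that only report matches for manual fixing, and ones that are safe enough to apply as replacements, including the line-break pattern that can't be used inside MediaWiki.

```python
import re

# Patterns copied from the regex table; the dictionary keys are illustrative names.
FIND_ONLY = {
    "plain_reference": r"\[(\d+)\]",
    "dash_after_space": r"\s-{1,2}|\s\u2013",
    "plain_ordered_list": r"(?<!\s)\d+\s*[.\-\)]",
    "missing_space_after_punctuation": r"[.,;:!?][A-Za-z0-9]",
}


def find_issues(text):
    """Check the text line by line and return (line_number, name, match) tuples."""
    issues = []
    for line_number, line in enumerate(text.split("\n"), start=1):
        for name, pattern in FIND_ONLY.items():
            for match in re.finditer(pattern, line):
                issues.append((line_number, name, match.group(0)))
    return issues


def clean_imported_text(text):
    """Apply the replace-all patterns, in an order where each step tidies up after the last."""
    text = re.sub(r"page \d+", "", text)           # drop leftover "page X" markers
    text = re.sub(r"(?<![.,;!?])\n", " ", text)    # join lines that were broken mid-sentence
    # Note: \s also matches newlines, so blank lines between paragraphs collapse too.
    text = re.sub(r"\s{2,}", " ", text)            # reduce runs of whitespace to one space
    text = re.sub(r"\s(?=[.,;:!?)%])", "", text)   # remove a space before punctuation
    return text


if __name__ == "__main__":
    raw = "The workers have nothing to lose \nbut their chains page 12 .\nThey have a world to win."
    print(clean_imported_text(raw))
    for issue in find_issues("Lenin wrote [12] about this.\n1) first point"):
        print(issue)
```

As with replace all in the editor, the cleaned text still deserves a read-through: these patterns are heuristics and will sometimes match things that were fine.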

Tools to help import and format works

https://pandoc.org/index.html Pandoc is a free, downloadable program that converts documents from one markup format to another. If you have a work in another format such as Markdown, HTML, or even EPUB (an e-book format), you can quickly convert it to MediaWiki markup and then copy and paste the result into ProleWiki.
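As a rough illustration, a conversion can be scripted like this. It is a sketch that assumes Pandoc is already installed and on your PATH; the function names and file names are hypothetical. It uses Pandoc's `--from`, `--to` and `--output` options with `mediawiki` as the target format.

```python
import shutil
import subprocess


def build_pandoc_command(source_path, output_path, source_format):
    """Build the pandoc argument list for converting one file to MediaWiki markup."""
    return [
        "pandoc",
        "--from", source_format,   # e.g. "markdown", "html" or "epub"
        "--to", "mediawiki",
        "--output", output_path,
        source_path,
    ]


def convert_to_mediawiki(source_path, output_path, source_format):
    """Run pandoc; raises if pandoc is missing or the conversion fails."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc is not installed or not on PATH")
    subprocess.run(
        build_pandoc_command(source_path, output_path, source_format),
        check=True,
    )


# Example call (hypothetical file names):
# convert_to_mediawiki("book.epub", "book.wiki", "epub")
```

The output file can then be opened in any text editor and pasted into the source editor on ProleWiki. Expect to do some manual cleanup afterwards, since converted markup is rarely perfect.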

OCR readers: OCR stands for Optical Character Recognition. These tools take an image of a document (such as a scanned book page saved as a JPG), recognize the characters in it, and turn the picture into text that you can select, edit, etc. Some work better than others, but I don't think any of them is 100% accurate, especially on older books where the characters are faded or the paper is stained.

Online libraries other than marxists.org

Marxists.org is run by Trotskyists, and they've been known to edit texts to fit their bias. Notably, in their copy of Mastering Bolshevism by Stalin, whole sections are missing and the work is given a completely different name. They've also removed portions and footnotes from Progress Publishers (Moscow) books.

As much as possible, we shouldn't use them. You simply cannot know whether their edition of a book is faithful to the original. They also rely on old translations for many texts, while I've found that newer translations tend to flow better.

Here are libraries we can use other than marxists.org. Make sure their copies don't themselves come from the MIA (Marxists Internet Archive), though.

Website | Content | Formats offered
http://www.marx2mao.com/ | Only covers Marx to Mao, as the name implies, but also has some material on other topics. Better quality than MIA. | Plain text, PDF
https://libgen.is/ | PDFs, if you already know what you're looking for | PDF
http://ciml.250x.com/archive/index.html | Never tested it. Only has Marx, Engels, Lenin, Stalin and Hoxha (it's a Hoxhaist website). | Plain text, PDF
https://redsails.org/categories | Our comrades at Red Sails also have a library (look under Authors); they also say where they source their works from. | Plain text
https://openlibrary.org | They have books on everything and from everyone, including Marxist authors. However, they have to use a "borrowing" system by law, and you have to make an account. Maybe use them to find a book, then download it from libgen. | Plain text, PDF
https://www.redstarpublishers.org/ | Lots of different books and documents | PDF
https://www.bannedthought.net | Mostly has documents about countries, but some books can be found there as well | PDF
http://hiaw.org/defcon6/works/cw/index.html | Has the Marx and Engels letters, which are otherwise copyrighted by a patent troll. | Plain text
https://espressostalinist.com/the-real-stalin-series/ | This page collects excerpts and collates them by topic; you can look for the original book on libgen. Could help you find books to add. | Plain text
http://www.desmondgreavesarchive.com | Books by Desmond Greaves, an Irish historian, on communist figures from Ireland | PDF
http://www.korean-books.com.kp/en | Books from the DPRK, in English | PDF
https://www.michael-parenti.org/articles | Articles by Michael Parenti, from his official website | Plain text

All works by Marx and Engels, Lenin, Stalin and Mao are available in their respective Collected Works, which can easily be downloaded as PDFs, for example from marx2mao.