thepit.social is one of the many independent Mastodon servers you can use to participate in the fediverse.
A Mastodon instance populated by the denizens of #bikeDC, #bikeVA, #bikeMD, and various other friends and communities. Share opinions, news, photos of your bike, food takes, and memes. There may be some horsing around. In this house, we #bancars. All hail The Pit!


#pdf

10 posts · 9 participants · 0 posts today

👉 Don't bury your research data in #PDF files. linkedin.com/posts/jonippolito

💽 Put raw data in data repositories, and save it in open formats so it stays reusable. #CSV and TSV are ideal formats for tabulated data.

www.linkedin.com
#archives #collections #data #digitalcuration #digitization… | Jon Ippolito
Can AI save PDF from itself? Probably not, so stop using PDF to archive your documents. PDFs are not easily accessible, hide their dependencies and security vulnerabilities, and rely on a 700-page spec developed by a proprietary vendor rather than an open standards body. Most importantly, they're a bitch to extract data from. (Have you tried copy-pasting text from a two-column PDF?) Along with bitmapped PDFs that are basically images, there are even searchable PDFs whose creators have accidentally or deliberately mixed up the letters in the font table to make them non-copiable. These failure points are why Arxiv started giving readers the option to read papers in HTML.
This sounds like a good job for AI, and in fact Mistral just launched an API for converting PDF to the simple, readable Markdown format. Unfortunately, independent tests have shown poor results https://lnkd.in/eyktiSaa. When you upload a PDF as background for your Custom GPT or NotebookLM podcast, this poor accessibility just compounds potential errors at inference time. But it's a lot worse when a PDF is consulted in a life-or-death situation. Optical LLMs have been caught mismatching column headings and lines of data. Imagine a doctor who uses AI to consult a PDF table of drug protocols and prescribes 500mg of a potentially toxic chemotherapy agent instead of 50mg.
Especially in science, the value of information is the ability to re-use it. Yet archivists in the digital era have spent way too much time figuring out how to get data into a repository compared to getting it out. As I've said before, a collection of PDFs isn't an archive; it's a tomb.
#Archives #Collections #Data #DigitalCuration #Digitization #DigitalPreservation #AIethics #GenerativeAI #GenAI #LLM #MachineLearning #OCR
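
As a rough illustration of the advice above about keeping tabulated data in CSV or TSV, here is a minimal Python sketch using only the standard library `csv` module. The column names, the sample values, and the helper names `rows_to_delimited` and `delimited_to_rows` are made up for this example; they do not come from the posts.

```python
import csv
import io


def rows_to_delimited(rows, fieldnames, delimiter=","):
    """Serialize a list of dicts to CSV/TSV text with a header row."""
    buf = io.StringIO(newline="")
    writer = csv.DictWriter(
        buf, fieldnames=fieldnames, delimiter=delimiter, lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def delimited_to_rows(text, delimiter=","):
    """Parse CSV/TSV text back into a list of dicts keyed by the header."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


# Hypothetical measurements, invented purely for illustration.
measurements = [
    {"sample_id": "A1", "dose_mg": "50", "response": "0.82"},
    {"sample_id": "A2", "dose_mg": "500", "response": "0.11"},
]
fields = ["sample_id", "dose_mg", "response"]

# Write as TSV, then read it back: each value stays tied to its column name.
tsv_text = rows_to_delimited(measurements, fields, delimiter="\t")
round_trip = delimited_to_rows(tsv_text, delimiter="\t")
```

Because every value travels with its column header, the data can be read back without guessing at a page layout, which is the reusability the post is arguing for.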

Why I 🧡 the web.

An offline-capable Notepad (also a Progressive Web App) that's been operating for 9 years. 🗒️

👉🏻 Dark mode
👉🏻 Privacy-focused
👉🏻 Full-screen mode
👉🏻 Supports monospaced and dyslexic #fonts
👉🏻 Floating window
👉🏻 Download notes as plain text, #PDF, and DOCX

notepad.js.org/

notepad.js.org
Notepad - Offline capable
An offline-capable notepad powered by ServiceWorker. It's quick, distraction-free, dark-mode enabled, mobile compatible (Android, iOS), and minimalist in nature.

"Police officers and employees misusing access to police database now account for over half of all cybercrime prosecutions in the UK. The harms this can cause are considerable.
Yet police continue to call for encryption to be weakened to allow for greater access to communication data."—Alice Hutchings [Cambridge Cybercrime Centre]

Police Behaving Badly >

cl.cam.ac.uk/~ah793/papers/202

Version 1 of our #BnF format policy had no entry for the #PDF format. The new version will have one, and here is its content in a draft that still needs to be validated:

github.com/hackathonBnF/Fiches

There would be much more to say about this format, but you have to start somewhere. I dare to think it adds a little something: it is in French, it is fairly practical, and it tries to be approachable.

GitHub
PDF
Description of file formats. Contribute to hackathonBnF/FichesFormat development by creating an account on GitHub.

In version 25.2, LibreOffice has once again improved the export of accessible PDF documents.

I took that as an occasion to show in detail how to arrive at a valid PDF/UA.

How must a document be structured and set up?
How does the export work, and how can I check the result?

It's all in the new blog post …

#LibreOffice #PDFUA #PDF #Accessibility #accessiblePDF #Barrierefreiheit #BFSG #EAA

alexandra-oettler.de/2025/tipp

www.alexandra-oettler.de
Tips for accessible PDF files from LibreOffice • Redaktionsbüro Alexandra Oettler
LibreOffice, PDF/UA, accessibility

@cstross Wow! This is a great example of the power of virtual platforms, #emulators, and also disciplined state control in a programming language. The fact that you can run this inside a #pdf is also instructive — yet another attack vector.
Also, I’ll bet there are many packages out there to do cross-compilation from #RISC-V to other machines. Remind me never to use Adobe Acrobat.