fredner.org
Digital humanities, US literature, data about literary cultures. VAP at the University of Richmond. CV & more: https://fredner.org
129 posts 990 followers 395 following

“The US of AI,” public draft of a talk given yesterday at Princeton. drive.google.com/file/d/1O2qk...

We’ve posted a job ad to join our team at LC Labs. I am very proud and excited about the work we have planned. Please share. I’ll also mention that we are part of the legislative branch and this is a partner-supported project. www.usajobs.gov/job/832669800

Introducing olmOCR, our open-source tool to extract clean plain text from PDFs! Built for scale, olmOCR handles many document types with high throughput. Run it on your own GPU for free, at over 3,000 tokens/s, equivalent to $190 per million pages, or 1/32 the cost of GPT-4o!

Announcing the release of Common Corpus 2. The largest fully open corpus for pretraining comes back better than ever: 2 trillion tokens with document-level licensing, provenance and language information. huggingface.co/datasets/Ple...

Big ups to @signal.org @meredithmeredith.bsky.social for allowing usernames instead of phone numbers last year (signal.org/blog/phone-n...). Washington basically runs on Signal right now and that one change has made reporting and whistleblowing way more private and secure.

Now you can check out this video game history museum online

Here is Tülu 3 405B 🐫 our open-source post-training model that surpasses the performance of DeepSeek-V3! It demonstrates that our recipe, which includes RLVR, scales to 405B, with performance on par with GPT-4o and surpassing prior open-weight post-trained models of the same size, including Llama 3.1.

I use the "Buffalo buffalo..." sentence to teach students the distinction between types and tokens. Delighted to discover this diagram on the Wikipedia article this semester: en.wikipedia.org/wiki/Buffalo...
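
For anyone who wants to see the type/token distinction in code, here is a minimal sketch. It assumes whitespace tokenization, and the function name `type_token_counts` is made up for illustration; it is not taken from the Wikipedia article or from any course materials.

```python
# The eight-word "Buffalo" sentence, with the city name capitalized.
sentence = "Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo"


def type_token_counts(text, fold_case=True):
    """Return (number of tokens, number of types) for a whitespace-split text.

    Tokens are individual occurrences; types are distinct word forms.
    fold_case=True treats "Buffalo" and "buffalo" as the same type.
    """
    tokens = text.split()
    if fold_case:
        tokens = [t.lower() for t in tokens]
    return len(tokens), len(set(tokens))


print(type_token_counts(sentence))                   # case-folded counts
print(type_token_counts(sentence, fold_case=False))  # case-sensitive counts
```

With case folding, the sentence has eight tokens but only one type; keeping case splits it into two types, which is a useful prompt for discussing what counts as "the same word."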

Apple released iPhone and Mac updates yesterday that activate AI features by default. Tech companies are blurring the distinction between using a computer and using AI. As a result, it will get harder and harder for students to tell the difference.

A (soft) launch for a new writing venture today!

RIP Jules Feiffer. Among many other things, Feiffer illustrated "The Phantom Tollbooth," which was THE book that got me interested in reading as a kid wapo.st/42nr17i

It's publication day! Available in print and Open Access mitpress.mit.edu/978026255091... A book about how digital social reading apps are changing and nurturing the way we read. I talk about Wattpad, Goodreads, AO3 and more. @mitpress.bsky.social @unigroningen.bsky.social @gronlp.bsky.social

RIP David Lynch www.youtube.com/watch?v=vI8c...

Dangers and Opportunities of Technology program is going international, with a focus on AI! So excited to see this collab with the UK finally in place. Read all of the linked materials carefully! March 20 deadline with AHRC serving as lead agency, and NEH leads for Sept 2025 deadline.

If you're working in computational literary studies, do please submit to the 4th Annual Conference of Computational Literary Studies, Krakow 2025 (hybrid event) jcls.io/site/ccls2025/

Looking to teach with Humanities Data in R? The second edition of our book is now out! Updates include tidyverse, new data sets, latest code techniques, and more! Available through many libraries. We also have a digital teaching component. DM or email for access! #digitalhumanities #datascience

Preparing for snow by taking turns on the radiator

T.S. Eliot died on this day in 1965. This photo of his mustard, from the Faber staff fridge in Russell Square, was taken not long afterwards

Paul Lafargue tagged every bench on Memorial Bridge:

It's out of copyright now -- Nella Larsen's "Passing." To celebrate, I put together a simple digital edition with web, TXT, PDF, and EPUB versions. Also wrote a brief intro. scalar.lehigh.edu/african-amer...

US Cybersecurity and Infrastructure Security Agency urges people to use Signal instead of SMS messages OR phone calls. Good stuff in this guide if you're interested in data privacy and security: https://www.cisa.gov/sites/default/files/2024-12/guidance-mobile-communications-best-practices.pdf

Well, this sure looks like a must-read for 19thC Americanists.

This study suggests that the total number of tokens from books in text-based models is less than 10% of the number of tokens from encyclopedias (including but not limited to Wikipedia). www.technologyreview.com/2024/12/18/1...

Way More Than Very Happy to announce this book, co-edited with the incomparable Isabel Galina, and co-authored with 70+ amazing folks from 5 continents: "The Routledge Companion to Libraries, Archives, and the Digital Humanities" www.taylorfrancis.com/books/edit/1... Tell your librarian!

What's in an anthology? @jdporter.bsky.social, Price Lab DH Specialist, & @fredner.org (with assistance from David McClure & the @stanfordlitlab.bsky.social) built a relational database of all 464 authors & 3,374 works published across the ten editions of the Norton Anthology of American Literature.

Publishers include: Cambridge UP, De Gruyter, Brill, Oxford UP, Sage, Taylor & Francis / Informa, Wiley. Link to the table: sr.ithaka.org/our-work/gen...

How can we visualize what a book ISN'T talking about? With an anti-tag cloud! See the most common English words that are never mentioned in a text. www.bewitched.com/demo/anti/
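
The anti-tag idea can be sketched in a few lines. This is only an illustration under stated assumptions, not how the linked demo is implemented: `COMMON_WORDS` is a tiny hypothetical stand-in for a real ranked English frequency list, and `absent_common_words` is a made-up helper name.

```python
import re

# Hypothetical stand-in for a ranked English frequency list (most common first).
# A real anti-tag cloud would use a much longer list from a reference corpus.
COMMON_WORDS = ["the", "of", "and", "to", "a", "in", "is", "you", "that", "it"]


def absent_common_words(text, ranked_words=COMMON_WORDS, limit=5):
    """Return up to `limit` of the most common words that never occur in `text`."""
    # Lowercase the text and collect every distinct word it contains.
    seen = set(re.findall(r"[a-z']+", text.lower()))
    # Walk the ranked list in order, keeping only the words the text lacks.
    return [word for word in ranked_words if word not in seen][:limit]


print(absent_common_words("The cat sat in the hat, and it is a hat."))
```

Because the list is walked in frequency order, the result starts with the most surprising absences, which is what a cloud would draw largest.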

My new book has its own page on Stanford's website: www.sup.org/books/digita... It will be published in June!

spent this afternoon reading some early US history (as one does) and was reminded of the important fact that these people were almost constantly drunk as hell

Since 1965, the National Endowment for the Arts has funded 3700+ writers! A team of @mcgill.ca student researchers and I have compiled a comprehensive database of all of them: a searchable, citable resource for anyone interested in American Literature! data.post45.org/posts/nea-cr...

Also possible to set this as your default search method in Chrome or Firefox: tenbluelinks.org#chrome-windows

I think pleasurable reading is an important part of high school English

Happy public domain drop to all who celebrate www.hathitrust.org/press-post/2...

Release of “nearly one million public-domain books” including “books scanned as part of the Google Books project that are no longer protected by copyright.” www.wired.com/story/harvar...

Has anyone's institution set up a computer lab that can be taken offline? Use case: for students to take timed exams where they write code and/or prose without internet or LLM access. (Yes, for this to work, students would have to place their phones out of reach during the exam.)

tl;dr: Use Signal wapo.st/3D5L3sC

This one is hard. ❤️

close reading archive launch! “surveys sundry print, archival, and digital resources, concentrating on Anglo-American literary studies that overtly reflect upon the practice of ‘close reading’” www.closereadingarchive.org/acknowledgme...

Belmont Library getting in the holiday spirit:

Unpopular opinion: Literary studies needs to figure out how to teach reading for pleasure. The quoted editorial cites a recent National Literacy Trust study showing an 8.8% decline in reading for pleasure among kids aged 8-18 *in one year*: literacytrust.org.uk/research-ser...

so this is what it feels like to be the 0.1%

“They said it could not be done.” We’re releasing Pleias 1.0, the first suite of models trained on open data (either permissively licensed or uncopyrighted): Pleias-3b, Pleias-1b and Pleias-350m, all based on the two-trillion-token set from Common Corpus.

There are many ways to identify texts that seem ahead of their time. Our CHR 2024 paper asks which measures of textual precocity align best with social evidence about influence and change.