lemire.bsky.social
Daniel Lemire is a Computer Science Professor at the University of Quebec (TELUQ). He ranks in the top 2% of scientists worldwide according to Stanford University/Elsevier's 2024 ranking.
75 posts 1,126 followers 7 following

21st Century C++ By Bjarne Stroustrup
1. We are moving to modules (replacing the headers and includes).
2. Use std::span instead of pointer addressing.
3. Adopt concepts.
4. Profiles!
cacm.acm.org/blogcacm/21s...

Thread-safe memory copy lemire.me/blog/2025/02...

Implementing urlpattern is way, way more difficult than you think it ought to be!

My belief is that programmer time is precious. We should spend time...
Fixing bugs.
Adding new features.
Improving the performance.
Patching code to silence false positives from disabled-by-default static analyzers is wasteful. I do not mind if others do it, but I avoid such work.

Regular expressions are the code equivalent of "with great power comes great responsibility" #regex

Regular expressions can blow up! lemire.me/blog/2025/01...

Thrilled to announce our latest publication in Frontiers in Artificial Intelligence! Our work is entitled "Shaping Integrity: Why Generative Artificial Intelligence Does Not Have to Undermine Education." www.frontiersin.org/journals/art... #AIinEducation

The simdutf library is used by many important systems including major Web browsers. We support SSE2, AVX2, NEON, AVX-512, RISC-V, LoongArch64... But we do not currently support accelerated processing for POWER processors. If someone is interested in helping out... github.com/simdutf/simd...

Checking whether an ARM NEON register is zero lemire.me/blog/2025/01...

JavaScript hashing speed comparison: MD5 versus SHA-256 lemire.me/blog/2025/01...

We are looking into building a benchmark for URLPattern. If you use URLPattern in the real world, what is your usage pattern?

Counting the digits of 64-bit integers lemire.me/blog/2025/01...

It appears that prior to release 14, Apple LLVM did not really support C++20 concepts although it claims that it does. The following ugly hack might be needed:
#if defined(__APPLE__) && defined(__clang__)
#if __clang_major__ <= 13
//...
#endif
#endif
@yagiznizipli.com

The latest release of the simdutf C++ library (6.0.0) brings in more convenience for C++20 users. While you used to have to provide both a pointer and a size parameter... often you can now just pass your container...

Special mention to @peterdimov.bsky.social and @guidovranken.bsky.social for spotting the problem:

In the ada C++ library, we are currently stuck with a GCC puzzle. We cannot seem to return an std::vector from a function without stack buffer overflow. Source: github.com/ada-url/ada/... We have been at this for over a week.

How does your URL parser handle Unicode? lemire.me/blog/2025/01...

Indexing the Bluesky social network with Roaring Bitmaps. jazco.dev/2024/04/20/r...

Efficient In-Place UTF-16 Unicode Correction with ARM NEON lemire.me/blog/2024/12...

Simpler and faster parsing code with std::views::split lemire.me/blog/2024/12...

Accessing the attributes of a struct in C++ as array elements? lemire.me/blog/2024/12...

Our paper, "Parsing millions of URLs per second", written with @lemire.bsky.social became one of the most read articles in Journal of Software: Practice and Experience. onlinelibrary.wiley.com/journal/1097...

Graduate degrees are overrated Though I have many brilliant graduate students, I love working with undergraduate students. And I am not at all sure that you should favor people with graduate degrees, given a choice. lemire.me/blog/2024/11...

How fast can you validate UTF-8 strings in JavaScript? When you recover textual content from the disk or from the network, you may expect it to be a Unicode string in UTF-8. How might you validate a UTF-8 string in a JavaScript runtime? lemire.me/blog/2023/12/0…

Reminder: The novel Rainbows End, written two decades ago, predicted that we would see AGIs emerge in 2025. lemire.me/blog/2015/09...

“Fifteen government departments have been monitoring the social media activity of potential critics and compiling “secret files” in order to block them from speaking at public events” www.theguardian.com/politics/202...

Land use versus cereal production.

How is electricity produced?

"We found that the incidence of self-reported COVID-19 was 33% higher in those wearing face masks often or sometimes, and 40% higher in those wearing face masks almost always or always, compared to participants who reported wearing face masks never or almost never." www.cambridge.org/core/service...

Burn acreage in the US

"China consumes as much cement every two years as the U.S. did over the entire 20th century." www.sustainabilitybynumbers.com/p/china-us-c...