Replacing C! Many Python developers are joining the Rust team

2024-02-29 13:08:22

Abstract: In the future, more and more libraries will use Python as the front end (improving programming efficiency) and Rust as the back end (improving performance).

Rust is replacing C as the “backend” for high-performance Python packages. What is the reason behind this?

First, let’s consider the motivation. Python is easy to write but slow to execute. Data processing libraries are especially hard to write, because it is difficult to get high performance out of pure Python. Yet Python is the primary language for machine learning and data engineering. So when you try to write a library for data engineers or machine learning engineers, you run into the following problem:

Although we need to write APIs in Python, high-performance data processing tasks cannot be done solely in Python.
This means that you have the following options for writing a library:

Either you learn and use C, or someone else learns C, writes a library, and you rely on that library to perform the low-level operations.
Someone familiar with C might ask, “Is there anything wrong with this?” After all, many library authors outsource their numerical computation to NumPy or SciPy, and you might think, “I can learn C.”

However, the situation is not so ideal. It’s convenient to outsource some tasks to libraries like NumPy, SciPy, etc., but it requires all functions to be vectorized and you can’t write your code in a for loop. You also have to worry about certain operations being blocked by the Global Interpreter Lock (GIL), and there are various other issues. Not everything you want to do can be easily found in a library that already exists.
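
As a rough illustration of the vectorization constraint, consider a loop in which each step depends on the result of the previous one. The sketch below uses a hypothetical `running_balance` function (the name and the floor rule are invented for this example) written as a plain Python loop. Because of the floor, it does not map onto one obvious array call, so it is the kind of logic that tends to stay in slow interpreted Python unless someone writes it in a compiled language.

```python
def running_balance(deltas, floor=0.0):
    """Accumulate deltas, but never let the balance drop below `floor`.

    Hypothetical example function, not taken from any library.
    """
    balance = floor
    history = []
    for delta in deltas:
        # Each step needs the previous balance, so the steps cannot be
        # computed independently of one another.
        balance = max(floor, balance + delta)
        history.append(balance)
    return history


if __name__ == "__main__":
    print(running_balance([5.0, -8.0, 3.0, -1.0]))
```

For a handful of values this is fine; the cost shows up when the same loop runs over millions of rows, one interpreted iteration at a time.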

That leaves another approach: writing the library from scratch in C and adding the Python bindings afterwards. However, if you come from a Python background, C feels very low-level, and learning the language takes effort. Null pointer dereferences, buffer overflows, memory leaks… these are traps you can fall into when using C, and they are unfamiliar to programmers whose first language is Python.

How great would it be if there were a language that was as fast and memory efficient as C, but didn’t require manual memory management or garbage collection? It would be even better if the language had good Python tool support and an ever-expanding developer community.

Rust

Rust is fast and manages memory efficiently. Its ownership rules also catch data races at compile time, which makes concurrent and parallel programming easier. Rust has great tools, a friendly compiler, and a large and active developer community. Rust makes your programs faster and allows you to make more friends while learning.

Most importantly, Rust is easier to learn than C for Python developers.

It improves the entry-level experience and makes it easier for beginners to write “safe” code. The learning curve is smoother, allowing you to gradually master more advanced language features over time.

As a result, over the past few years, more and more high-performance libraries have chosen Python as their front end and Rust as their back end. For example:

Lance: A high-performance, low-cost vector database.
Founders Chang She and Lei Xu originally wrote the library in C++, but decided to switch to Rust, in part so they would no longer have to work with CMake. Here is how Chang explains the decision.

“The decision to switch from C++ to Rust was because I could work more efficiently, without losing performance, and didn’t have to deal with CMake. I basically started learning Rust from scratch, and while I was learning it, Lei and I rewrote about 4 months’ worth of C++ code in Rust. Each time we write and release a new feature in Rust, we become more confident, and I don’t have to worry about getting a segfault on every other command.”

Not only is Rust well-suited for data processing, it can also serve as a backend for many other Python packages with high performance requirements.

Pydantic: A Python data validation library for developers.
The Pydantic team rewrote the second version in Rust and saw a 20x performance improvement even for simple models. Besides performance improvements, Rust has several other benefits. Pydantic founder Samuel Colvin cites several advantages.

“Another thing about Rust is that, in addition to its speed, code written in Rust is usually easier to use and maintain. In particular, Rust catches and handles all possible errors, whereas the Python (and TypeScript) type systems tend to ignore these errors, so I have no idea which exceptions can be raised in which situations when calling ‘foobar()’. Basically I have to find the possible failures by trial and error.”
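
The problem Colvin describes can be sketched in a few lines of Python. The function `parse_port` below is hypothetical, written only for this illustration: its signature promises an `int`, and nothing in it tells the caller that a `ValueError` may be raised instead.

```python
def parse_port(text: str) -> int:
    """Convert text to a TCP port number (hypothetical helper)."""
    port = int(text)  # raises ValueError for non-numeric input
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    return port


def try_parse(text: str) -> str:
    # The caller has to learn from documentation or trial and error
    # that ValueError is the exception to catch; the type hints on
    # parse_port do not mention it.
    try:
        return f"ok: {parse_port(text)}"
    except ValueError as exc:
        return f"error: {exc}"


if __name__ == "__main__":
    for candidate in ["8080", "http", "70000"]:
        print(try_parse(candidate))
```

In Rust, a comparable function would typically return a `Result`, so the possibility of failure is part of the signature and the compiler warns when the caller ignores it.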

Combining Python and Rust

In the future, more and more libraries will use Python as the front end and Rust as the back end. Overall, Python developers today have a better and smoother approach to building high-performance libraries.

Disclaimer: This article was contributed by one of our users. Please let us know of any infringement so that it can be removed immediately.
