Which Python project has the most lines of code? I bet you don't know

I keep seeing people repeat the line that dynamic typing feels great in the moment, but refactoring lands you in the crematorium.

Yet there are well-known open source projects, and well-known high-traffic websites such as GitHub and Instagram, built on dynamic languages. After many years of refactoring, I have not heard of any of their authors ending up in the crematorium. I don't understand the critics: do they really not know about these cases, or do they pretend not to see them? They go on to claim that dynamic-language projects become unmaintainable beyond a certain size. That claim is hardly worth refuting, but it did remind me that I am curious about how large a project developed in a dynamic language can get.

As far as I know, the largest project developed in a dynamic language is probably OpenStack, whose code base is said to have reached millions of lines and is still growing. That is, of course, a good demonstration of what a dynamic language can do. However, such a huge project is not easy to analyze (well, the real reason is that I'm too lazy to download such a large code base). Instead, I chose to analyze some of the better-known projects in the Python community, mainly from GitHub, with a few from other repositories. The selection is somewhat subjective, but I believe most of these projects are quite representative.

The tool used to count the code is cloc. For every project I took the trunk code as of January 3, 2018. The statistics include only Python files and exclude all other file types. It is worth pointing out that cloc 1.60, the version installed by default through Ubuntu's APT, produced incorrect statistics for some of the projects. This problem has been fixed in the latest version, so all the statistics in this article use cloc v1.72 downloaded from the official website.
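To make the code/blank/comment split concrete, here is a minimal sketch of that kind of line classification. It is not cloc's actual algorithm: as a simplifying assumption, only lines that start with `#` count as comments, while docstrings and trailing comments count as code. The names `LineCounts` and `classify_lines` are made up for this illustration.

```python
from typing import NamedTuple


class LineCounts(NamedTuple):
    code: int
    blank: int
    comment: int


def classify_lines(source: str) -> LineCounts:
    """Rough code/blank/comment tally for Python source text.

    Assumption (not cloc's real rules): a line is a comment only if its
    first non-space character is '#'. Docstrings and trailing comments
    are counted as code.
    """
    code = blank = comment = 0
    for line in source.splitlines():
        stripped = line.strip()
        if not stripped:
            blank += 1
        elif stripped.startswith("#"):
            comment += 1
        else:
            code += 1
    return LineCounts(code, blank, comment)


# Tiny hand-written sample: one comment line, two code lines, one blank line.
sample = "# add two numbers\ndef add(a, b):\n\n    return a + b  # sum\n"
print(classify_lines(sample))
```

A real counter has to handle multi-line strings, mixed lines, and many languages, which is why the article relies on cloc rather than a script like this.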

[Table: lines of Python code per project, sorted by code size]

The table above is sorted by number of lines of code. Interestingly, of the four largest code bases, all except CPython are operations and maintenance (O&M) projects. Projects that I guessed would have more code, such as Odoo, rank rather low. My understanding of O&M projects is limited, so I am not sure why they come out on top; perhaps it is because they have to support more, and more complex, things.

[Table: code, blank, and comment line counts per project]

Sentry, which has the most pure Python code in this survey, has reached almost 700,000 lines, which makes it a fairly large project. Three projects have 300,000 to 500,000 lines of code, including the foundational project CPython. Three projects each sit at the 200,000 and 100,000 line levels, and the remaining 7 are under 100,000 lines. Having read this list, you should be convinced that dynamic languages are at least up to projects with hundreds of thousands of lines of code, which is the upper limit for most common applications. If the code really reaches millions of lines, then regardless of the language, the project is bound to face being split up.

The table above breaks the line counts down into code, blank, and comment lines, which also reflects each project's code style to some extent. Sentry has the most code in this survey, but as the table shows, its share of comments is disproportionately small compared with other projects, which suggests that Sentry's authors pay little attention to comments.

Readers will have noticed that, besides the code-related indicators, I added a few other things to the list, purely out of personal interest.

[Chart: average number of lines per file, by project]

The first indicator is the average number of lines per file. From the standpoint of modularity, piling too much content into a single file is obviously unreasonable: it usually means too much coupling and makes the code hard to understand and modify. However, there is no clear standard for how much is appropriate. I hope that by analyzing these projects I can see what choices open source authors have made in practice.
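The indicator itself is just total lines divided by number of files. A minimal sketch, assuming the per-file line counts are already available as a mapping (the function name and the file names in the example are hypothetical, not taken from any of the surveyed projects):

```python
from typing import Mapping


def average_lines_per_file(lines_by_file: Mapping[str, int]) -> float:
    """Return total lines divided by file count (0.0 when there are no files)."""
    if not lines_by_file:
        return 0.0
    return sum(lines_by_file.values()) / len(lines_by_file)


# Made-up file sizes, for illustration only.
example = {"core.py": 900, "utils.py": 120, "cli.py": 180}
print(average_lines_per_file(example))  # (900 + 120 + 180) / 3 = 400.0
```

Note that an average hides outliers: one 900-line file and two small ones give the same figure as three 400-line files.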

The results are distributed fairly evenly, from 100 to 600 lines per file, with no obvious concentration point. Interestingly, the top two (Pandas, NumPy) are closely related to each other and both deal with mathematics and statistics. This may be because a math library is relatively pure and single-purpose, so it is not as easy to split up as other libraries. The last few (Pillow, youtube-dl, Odoo, Scrapy) indirectly support this conjecture: they are all oriented toward a specific application area, so they are easier to modularize.

[Chart: ratio of comments to code, by project]

The second indicator is the ratio of comments to code, and the situation here is similar. More detailed comments are not always better, but a certain amount is always needed to explain the "why". Too few comments mean the author did not leave enough clues for later maintainers, which may cause maintenance problems. On the other hand, everything we are examining is an open source project, with no company reviews or KPI constraints, so we can rest assured that authors are not padding their comments on purpose. The aforementioned Sentry, unsurprisingly, comes last because it has too few comments. This does not necessarily mean the project is poor, but it is at least a signal that it may be hard to maintain. For projects whose authors are willing to put effort into writing comments (Ansible, NumPy, Fabric, Salt, etc.), that effort reflects a considerable investment in the project, which is a good sign that these projects are trustworthy.
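For reference, a ranking like this can be computed as in the sketch below. I am assuming the ratio is comment lines divided by code lines; the article does not spell out the exact formula. The project names and numbers are placeholders, not measurements from this survey.

```python
from typing import Dict, List, Tuple


def comment_ratio(comment_lines: int, code_lines: int) -> float:
    """Comment lines per code line (assumed definition); 0.0 if there is no code."""
    if code_lines == 0:
        return 0.0
    return comment_lines / code_lines


def rank_by_comment_ratio(
    projects: Dict[str, Tuple[int, int]]
) -> List[Tuple[str, float]]:
    """Sort projects by comment ratio, most heavily commented first.

    `projects` maps a project name to (comment_lines, code_lines).
    """
    ratios = {
        name: comment_ratio(comments, code)
        for name, (comments, code) in projects.items()
    }
    return sorted(ratios.items(), key=lambda item: item[1], reverse=True)


# Placeholder data, for illustration only.
made_up = {"project_a": (2_000, 10_000), "project_b": (500, 10_000)}
for name, ratio in rank_by_comment_ratio(made_up):
    print(f"{name}: {ratio:.2f}")
```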

One thing I did not expect is that CPython, the mother of all these projects, ranks near the bottom. In principle, such a foundational project should have more comments. On second thought it is understandable: CPython has separate, very detailed documentation, which most other projects lack, so fewer comments in the code are excusable.

[Chart: file types per project]

The last statistic is about file types. The bulk of a Python project should of course be Python code, but I also wanted to see which other major file types a project includes. C, HTML, and JavaScript on the list are no surprise, but there is one kind of file I did not anticipate: .po (the language resource file commonly used in open source projects). For both Django and Django-CMS, there are even more lines in PO files than in Python code.
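A breakdown like this can be roughly approximated by grouping raw line counts by file extension. The sketch below is only an approximation of what cloc reports: it counts every line, including blanks and comments, and it identifies file types purely by suffix. The function name and the demo files are hypothetical.

```python
import tempfile
from collections import Counter
from pathlib import Path


def count_lines_by_extension(root: str) -> Counter:
    """Sum raw line counts per file extension under `root`.

    Assumption: every line counts, blank or not, and files that are not
    valid text are read with undecodable bytes ignored.
    """
    counts: Counter = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file():
            text = path.read_text(encoding="utf-8", errors="ignore")
            counts[path.suffix.lower() or "(none)"] += len(text.splitlines())
    return counts


if __name__ == "__main__":
    # Build a tiny throwaway tree so the demo does not depend on a real checkout.
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "app.py").write_text("import os\nprint(os.name)\n")
        Path(tmp, "django.po").write_text('msgid "Hello"\nmsgstr "Bonjour"\n')
        print(count_lines_by_extension(tmp).most_common())
```

Pointing the function at a real project checkout instead of the temporary directory gives a quick first impression of how much of the repository is not Python at all.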

Looking into it, Django supports more than 90 languages, so it is no wonder there are so many language files. This result is also a reminder that some people (not only programmers, but also most inexperienced bosses, customers, product managers, and so on) subconsciously think that software development is nothing more than writing code. When estimating, work other than coding is often given only a very short time, picked off the top of one's head. But in a real project, code is only one part. The "other work" sometimes, and I should say often, takes up most of your time and energy.

These tasks are often not fun, but they are an integral part of the project. I hope readers will give them enough attention.
