Ever wondered if you should learn a programming language? If so, stick around.
In this article, my goal is to present a high-level view of two languages I have found useful in my forestry career for automating repetitive tasks: R and Python. My hope is that readers with a similar need will be encouraged to learn one, or both. Even for a beginner, the basics can be learned in just a few hours. There are many online tutorials available for learning a programming language, and I have provided a few of my favorite resources at the end of the article.
R and Python are common programming languages used for data wrangling, analysis, and presentation. Both are supported on multiple operating systems, including Windows, Linux, and macOS. Because both are open source, these tools are used across many disciplines, from finance to space exploration. What makes them useful in the forest industry is something I’d like to explore further.
Each language has an organizing body responsible for maintaining the software and providing version updates. According to the R-Foundation, R is an “integrated suite of software facilities for data manipulation, calculation and graphical display.” Python.org describes Python as an “interpreted, interactive, object-oriented programming language.” Next, let’s dive into the technical weeds a little bit to point out some of the key differences.
R is well suited to cleaning, manipulating, and presenting data efficiently. Anyone who works with large amounts of data can benefit from using R, including biometricians, statisticians, university students, and academics. R is not a general-purpose programming language and thus is not intended for creating traditional software products. It is primarily used for data analysis and for building statistical models that are incorporated into other downstream products. Forestry professionals working with financial, environmental, or biometric data may find R to be exactly what they need to process large datasets or to evaluate complex models.
R has an extensive collection of libraries (called packages) capable of advanced statistical analyses and graphical display. The integrated development environment (IDE) most commonly used when writing R code is RStudio.
Like R, Python is a good tool for cleaning, manipulating, and presenting data, but its usefulness extends well beyond that. Python is capable of creating full-featured desktop, mobile, and web applications.
There are plenty of IDEs available for writing Python code, including IDLE, which installs with Python. Python syntax is known for its readability, which makes it easier for beginners to learn. Python also has an extensive collection of packages, with well over 314,000 hosted on PyPI at the time of writing.
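To give a sense of that readability, here is a minimal sketch that totals the basal area of a few trees from their diameter at breast height (DBH). The function name and the diameter values are hypothetical, made up only for illustration.

```python
import math


def basal_area_sq_ft(dbh_inches):
    """Return the basal area of one tree in square feet, given DBH in inches."""
    radius_ft = dbh_inches / 2 / 12  # diameter in inches -> radius in feet
    return math.pi * radius_ft ** 2


# Hypothetical diameters (inches) for trees tallied on one plot
dbh_measurements = [8.5, 12.0, 15.3]

total = sum(basal_area_sq_ft(dbh) for dbh in dbh_measurements)
print(f"Total basal area: {total:.2f} square feet")
```

Even without prior Python experience, most of these lines can be read almost like plain English.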
The general-purpose nature of Python makes it suitable for nearly any software or automation purpose one can think of. Foresters may want to use Python to automate repetitive tasks or to aggregate data from a directory or collection of files. I have found Python handy for many tasks, including processing data from GPS units, reading Excel spreadsheets, and connecting to online weather APIs. (See the previous articles “External Data: Acquired data using APIs” and “Convert a GPS Garmin Device into a Sample Plot Collector.”)
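As a minimal sketch of the file-aggregation idea mentioned above, the snippet below combines every CSV file in a folder into one list of rows using only Python's standard library. The function name, file names, and column names (plot_id, species, dbh_in) are hypothetical, chosen only for illustration; real field data would likely need extra cleanup.

```python
import csv
import tempfile
from pathlib import Path


def combine_csv_files(folder):
    """Read every .csv file in `folder` and return all rows as one list of dicts.

    Each row gets an extra "source_file" key so it can be traced back.
    """
    rows = []
    # sorted() keeps the file order predictable from run to run
    for csv_path in sorted(Path(folder).glob("*.csv")):
        with csv_path.open(newline="") as handle:
            for row in csv.DictReader(handle):
                row["source_file"] = csv_path.name
                rows.append(row)
    return rows


if __name__ == "__main__":
    # Build two small, made-up plot files in a temporary folder for the demo.
    with tempfile.TemporaryDirectory() as demo_folder:
        Path(demo_folder, "plot_01.csv").write_text(
            "plot_id,species,dbh_in\n1,DF,14.2\n1,WH,9.8\n"
        )
        Path(demo_folder, "plot_02.csv").write_text(
            "plot_id,species,dbh_in\n2,DF,16.0\n"
        )
        combined = combine_csv_files(demo_folder)
        print(f"{len(combined)} rows read from {len(set(r['source_file'] for r in combined))} files")
```

Pointing combine_csv_files at a real folder of exported plot files would replace the copy-and-paste step that is often done by hand in a spreadsheet.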
Another benefit of Python is that it is well accepted by the geospatial community. Professional GIS software systems such as ArcGIS and QGIS can be extended using custom Python code. Esri even provides an ArcPy module for extending ArcGIS products. In fact, this is why I began learning Python in 2014: to automate GIS tasks that I was working on at the time. R also has a small geospatial following, but it is less commonly used for this purpose and is more limited in functionality than Python.
The list below provides links to some online resources for learning Python and R. Some of these sites require a subscription, while others provide free courses and tutorials. Personally, I like the services that offer coding exercises and challenges along with the tutorials. For me, these resources are the best way to learn and to keep a record of progress over time.
Anyone with time to invest in learning both languages will obviously have a more rounded toolset for communicating with a broader audience. However, if time or resources do not allow this, my recommendation would be to learn Python. But then again, this is a personal bias.
This was a brief introduction to R and Python, focusing on how each might be used in the forest industry. If you have experience with one of these languages and would like to share your thoughts, please leave a comment below.
Python.org. General Python FAQs – What is Python? Retrieved from https://docs.python.org/3/faq/general.html#what-is-python
R-Foundation. What is R? – The R environment. Retrieved from https://www.r-project.org/about.html