
Why We Chose Python for Analytics


In this blog post, we will talk about why we decided to use Python for analytics in the Cloud Health Analytics Group at Seagate Technology.
For context, we are a group of engineers with widely different programming backgrounds. Some of us have used C++ and Fortran to solve complex generalized matrix equations; others have used Matlab and Mathematica for fast analysis and basic modeling of collected data sets. Our group currently focuses on scalable monitoring software, which has three major components: telemetry, analytics, and visualization. In a recent blog post, we detailed how we visualize the collected data. Here we focus on the analytics or, more precisely, on why we chose Python as our tool for it. It boils down to three S’s: Speed, Support, and Scope.
Speed:
By speed, we mean how quickly new features can be developed. Python is a high-level language, and that accelerates our code development in several ways. First, it makes prototyping ideas and code fast. Second, Python is relatively easy to learn, which is ideal for a group like ours where, as in many data science groups, people come from different programming backgrounds. Finally, and most importantly, Python code is transparent: what the code says is close to what happens when it runs. This transparency eases both maintenance (rewriting, finding bugs, etc.) and the process of adding to the code base in our multi-user development environment.
Support:
Python is widely used for scientific computing in both academia and industry. As a consequence, a large number of useful analytics libraries are available (and well tested!), including packages for numerical computing, data analysis, statistical analysis, visualization, and machine learning. To get going on a topic, it is usually enough to Google ‘Python + [your analytics approach/tool]’. Soon after, you can be testing code that does the analysis you wanted, with vast amounts of documentation and examples at hand to guide you.
Scope:
Python supports object-oriented programming and built-in high-level data structures such as lists, tuples, sets, and dictionaries. Matrix operations are available through the NumPy library, and the pandas package provides data frames. Having these capabilities within Python's scope simplifies and speeds up data operations.
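As a minimal sketch of how these built-in structures combine in a small analytics task, consider the snippet below. The `DriveReading` class, the drive names, and the sample values are hypothetical and made up for illustration; they are not taken from our code base. NumPy and pandas are left out on purpose so the snippet runs with the standard library alone.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class DriveReading:
    """One hypothetical telemetry sample (illustrative only)."""
    drive_id: str
    temperature_c: float
    status: str


# list: an ordered collection of samples (made-up values)
readings = [
    DriveReading("d1", 31.5, "ok"),
    DriveReading("d2", 44.0, "warn"),
    DriveReading("d1", 33.5, "ok"),
    DriveReading("d3", 29.0, "ok"),
]

# set: the unique drives seen in the data
drive_ids = {r.drive_id for r in readings}

# dictionary: group temperatures by drive
temps_by_drive = {}
for r in readings:
    temps_by_drive.setdefault(r.drive_id, []).append(r.temperature_c)

# dictionary comprehension: mean temperature per drive
mean_temp = {d: sum(t) / len(t) for d, t in temps_by_drive.items()}

# tuple: (drive_id, temperature) of the hottest single reading
hottest = max(
    ((r.drive_id, r.temperature_c) for r in readings),
    key=lambda pair: pair[1],
)

# Counter (a dictionary subclass): how often each status occurs
status_counts = Counter(r.status for r in readings)

print(sorted(drive_ids))
print(mean_temp)
print(hottest)
print(status_counts)
```

For larger data sets, the same grouping and averaging would more naturally be expressed with a pandas data frame; the point here is only that the core language already covers a lot of ground.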
Another important aspect is that Python is freely available and portable: code developed on one platform generally runs on others. Python runs on both Windows and Linux.
Principles we use for our Python programming
To conclude, let us mention a few of the principles that guide our coding style. They come from The Zen of Python:
    “Beautiful is better than ugly.
    Explicit is better than implicit.
    Simple is better than complex.
    …
    Readability counts…” 
In short, we use a coding style which enhances readability and maintainability.
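The full text of The Zen of Python is printed by running `import this` in a Python interpreter. As a small illustration of what "explicit" and "readable" can mean in practice, here is a sketch; the function name and the sample readings are hypothetical and chosen only for this example.

```python
from statistics import mean


def mean_temperature_celsius(readings_celsius):
    """Return the mean of a non-empty sequence of temperatures in Celsius."""
    # Explicit is better than implicit: fail with a clear message
    # instead of letting an empty input surface as an obscure error later.
    if not readings_celsius:
        raise ValueError("readings_celsius must not be empty")
    # Simple is better than complex: use the standard library helper.
    return mean(readings_celsius)


print(mean_temperature_celsius([30.0, 32.0, 34.0]))  # prints 32.0
```

The name states the unit, the docstring states the contract, and the error case is handled where it occurs, so a colleague reading the call site does not have to guess.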

Authors: Christian B. Madsen, Estelle Cormier, and Javier Von Stecher
