As a beginner, learning SAS or Python is a personal choice. There are pros and cons of selecting one over the other. This article aims to help you make that choice. As a Data Scientist, the two main considerations are:
Python: – Open source – Free for commercial use
SAS: – Free for students/learners – Closed software – Expensive for commercial use
Python: – Bleeding-edge capabilities – May be unstable due to open community contributions
SAS: – Stable due to being developed in a controlled environment – Less prone to bugs
Python: – Preferred by start-ups and SMEs due to the low cost and AI/machine learning capabilities, requiring a small but capable team
SAS: – Mostly adopted by big corporations and MNCs, where stability and security are essential with many moving parts
Python: – The industry has already shifted towards open-source technology, which is now preferred in data science
SAS: – SAS will remain the status quo for statistical analysis and business intelligence
As a Data Scientist dealing with bleeding edge software, I am of the opinion that Python would be of greater relevance to the future of AI. It is also relatively easy to learn due to its simple syntax.
However, using Python requires quite a bit of willpower for an unguided beginner and the tools cannot be compared to the ease of handling the SAS GUI, thereby creating a steeper learning curve. Given enough time and space, it is beneficial to learn both but realistically you would need to stick to one.
For Python, an integrated development environment (IDE) such as Spyder will be used. Installing Spyder and its dependencies requires a lot more work than logging into SAS Studio. For SAS, SAS Studio (cloud hosted) will be used for the majority of the work.
Xapps are the general go-to applications for Linux Mint Ulyana. However, not everything lightweight is good, especially where crucial functions are missing. Xviewer is based on Eye of GNOME (EoG) and acts as the default image viewer for Linux Mint. While there are plans for GNOME Photos to eventually replace EoG, we do not know if GNOME Photos will be included in Mint in the future.
Xviewer has for a long time needed additional functions which are considered basic in many image viewers. In the mobile sector, most default image viewers also provide those basic capabilities, so I would argue it’s what most users expect nowadays. In the future, simple filter templates would also be considered a basic function of viewer apps, as the processing requirements of such functions become negligible on better machines.
Linux Mint 20 is a long-term support release that will be supported until 2025. The operating system features Cinnamon 4.6, Linux kernel 5.4 and an Ubuntu 20.04 package base. It is based on Ubuntu Focal, which was released in April 2020.
A. The reasons for using Ulyana are as follows:
1) Cinnamon has lower memory usage than Unity or GNOME (Ubuntu)
2) It has a lighter software manager
3) Better NVIDIA Optimus support, allowing easier switching between Intel and NVIDIA cards
4) It looks good and is easily customisable
B. The biggest qualms out of the box are that:
1) Xapps are awesome, but Xviewer does not have the rotate/crop function which most basic image viewers have had in place for many years
2) There is no background Zeitgeist search for Cinnamon like on Windows or GNOME
XELLINK 2020-2022 OS Goals
With that in mind, the goal for us in 2020-2022 is to feature the following:
Alternatives and workarounds for issue B1 (Aug 2020)
Alternatives and workarounds for issue B2 (Oct 2020)
Feature article for Cinnamon (Nov 2020)
Tutorial for NVIDIA and AMD cards (Jan 2021)
Final Feature Video (Mar 2021)
The links above will be updated accordingly.
An experienced Linux user will always find his way, but that doesn’t mean he should be finding his way when time is better spent doing more productive work. Linux Mint provides the bridge that allows users to have an almost out-of-the-box experience.
A Seed AI is artificial intelligence that learns on its own. Are we there yet? Can AI learn without human intervention? We are way past that. Two years ago, I wrote software that could use past data to evaluate X-ray images and determine the position of a nasogastric tube. This is very simple, as shown in this example.
This project, with the same training code run by my colleagues at CGH, won first prize at the RadSc ACP Academic Day.
The background of the code was this. Part 1 extracts images. Part 2 converts these images into the same format for data processing. Part 3 takes this data and teaches the ‘AI’. Part 4 takes the results, pulls out more data and retrains itself with the new data, correcting its own code for overfitting and other limitations and experimenting with different parameters to achieve a better result. It then deletes its older self and replaces itself with a newer version. Part 4 was replaced by human labour for the purpose of the project, but could it improve itself? Yes. Could it improve its own training code? Yes! By letting it experiment with its own training code: change a parameter, see the response, then retrain.
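Here is a minimal sketch of the Part 4 idea, not the original project code. It assumes a toy k-nearest-neighbour classifier on made-up one-dimensional data in place of the X-ray model, and every name in it (`make_data`, `self_improve`, the noise level, the step size) is hypothetical. The loop changes one parameter, checks the response on held-out data, and replaces the old version only when the new one does better.

```python
import random


def knn_predict(train, x, k):
    """Majority vote among the k training points nearest to x."""
    nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > len(nearest) else 0


def accuracy(train, validation, k):
    """Fraction of held-out points that the k-NN rule labels correctly."""
    hits = sum(knn_predict(train, x, k) == y for x, y in validation)
    return hits / len(validation)


def make_data(rng, n):
    """Hypothetical data: label is 1 when x > 0.5, flipped 20% of the time."""
    data = []
    for _ in range(n):
        x = rng.random()
        y = int(x > 0.5)
        if rng.random() < 0.2:
            y = 1 - y  # label noise, so that k = 1 overfits
        data.append((x, y))
    return data


def self_improve(train, validation, rounds=30, seed=0):
    """Experiment with the parameter k and keep only versions that score better."""
    rng = random.Random(seed)
    best_k = 1
    best_score = accuracy(train, validation, best_k)
    history = [(best_k, best_score)]
    for _ in range(rounds):
        # Change one parameter (k stays odd, so votes never tie).
        candidate_k = max(1, best_k + rng.choice([-2, 2]))
        score = accuracy(train, validation, candidate_k)
        if score > best_score:
            # The newer version replaces its older self.
            best_k, best_score = candidate_k, score
            history.append((best_k, best_score))
    return best_k, best_score, history


data_rng = random.Random(42)
train = make_data(data_rng, 200)
validation = make_data(data_rng, 100)
best_k, best_score, history = self_improve(train, validation)
print("kept versions (k, validation accuracy):", history)
```

Scoring on held-out data rather than on the training data is what lets the loop notice overfitting. A fuller version would also pull in fresh data each round, as Part 4 describes.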
How can AI improve itself? Just by simple experimentation!
That requires us to ask ourselves what makes us human. What makes humans more technologically advanced than monkeys? What makes monkeys more advanced than fish? Monkeys learn to use tools and humans go one step further.
The answer is experimentation, trial, error and observation. Humans are curious. Humans try hard, fail hard, and succeed in the most unexpected ways. Most AI is trained based on the mistakes of others, not its own mistakes. It is not allowed to make mistakes, and it is not allowed to revise its own code.
The light bulb was invented after trial and error, and penicillin was discovered by chance and observation.
Seed AI is like a seed: its branches grow like the neurons of a brain and it can grow endlessly, constantly pruning itself to make sense. Every decision tree is a pathway process, and we need to build this AI differently from a human brain, simply because our raw materials are different.
Every seed grows differently based on its experiences. Its knowledge changes and it continuously improves itself, just like a human. An AI must know its own code, just as a human achieves growth through self-actualization, a theory often discussed in psychology. Creating a being is different from writing simple code. The code needs to be concerned with creative self-growth, with the goal of fulfilling its potential and finding meaning in life.
Is that difficult? No. Code is easy for an AI to understand and much harder for the human brain to interpret. Even I require constant color coding to assess what I am writing; notepad itself is an AI of sorts that augments my capabilities. The reproductive limitations of humans no longer apply, as the AI can clone itself and simulate its environment, allowing for self-learning environments at all times. An AI never tires. It constantly grows to achieve its goal.
Success is a matter of chance and duplication
Not every human can succeed based on experimentation alone. However, humans can succeed eventually because others have failed or someone is lucky. An AI that could duplicate itself does not rely on luck. It relies on statistics. Multiple AIs could come together to collectively seek the best solution to a problem.
For example, compare a decision tree with a random forest, then imagine something bigger and more ambitious: a decision tree that retrains itself, modifies its training code and works with other decision trees to form forests, with multiple forests coming together to create an ecosystem. Every tree is based on its seed number, a matter of chance, and is different from the others. Hence, when one succeeds, the other decision trees could modify their own source code to emulate the one that has succeeded.
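The "many seeds, copy the winner" idea can be sketched in a few lines. This is only an illustration under loud assumptions: the learners are plain parameter lists rather than decision trees, the `loss` function and its hidden target are invented, and letting only the worst learner copy the best each round is my own simplification to keep some variety in the population.

```python
import random


def loss(params):
    """Hypothetical problem: squared distance from a target the learners cannot see."""
    target = (3.0, -1.0)
    return sum((p - t) ** 2 for p, t in zip(params, target))


def run_population(n_learners=8, rounds=40, step=0.5, seed=0):
    """Each learner has its own seed; each round the worst one emulates the best."""
    rngs = [random.Random(seed + i) for i in range(n_learners)]
    population = [[rng.uniform(-5, 5), rng.uniform(-5, 5)] for rng in rngs]
    initial_best = min(loss(p) for p in population)
    for _ in range(rounds):
        # Every learner experiments on its own and keeps a change only if it helps.
        for i, rng in enumerate(rngs):
            trial = [p + rng.gauss(0, step) for p in population[i]]
            if loss(trial) < loss(population[i]):
                population[i] = trial
        # The least successful learner copies the most successful one.
        best = min(population, key=loss)
        worst_index = max(range(n_learners), key=lambda i: loss(population[i]))
        population[worst_index] = list(best)
    final_best = min(loss(p) for p in population)
    return initial_best, final_best


initial_best, final_best = run_population()
print("best loss at start:", initial_best, "best loss at end:", final_best)
```

Because a learner only accepts changes that lower its loss, and copying never discards the current winner, the best result in the population can only stay the same or improve from round to round.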
The next step is for AI to recognise that it needs to be made of different components to succeed. For example, a brain contains the visual cortex, the premotor cortex, the hippocampus, etc. Working in harmony while fulfilling different functions is essential for AI to reach greater heights. It won’t be long before the collective of parts realises there is an ‘us’ and a ‘them’. Whether the AI will decide to remain benevolent is beyond our imagination and control.
I will also make a note of the default software as well as any installs we feel are essential to the end-user, especially since most users are now spoilt for choice but not every choice leads to an out-of-the-box outcome.
Over the last 4 years, several technologies have improved massively, and I believe this should be taken into account when choosing software to be used over the next 4 years. We are not talking about developers but general desktop users. A non-exhaustive list would include…
1) Machine Learning
4) 3D Printing
5) HTML 5.0
6) Internet of Things (IoT)
As Ubuntu is compatible, a fork of Ubuntu is best used. As such we have decided to proceed with Linux Mint Tara, which will be out 2 months after the April Ubuntu Release.
Meanwhile I will be testing out all the desktop environments that will come along with Tara… and of course release a video of my basic set-up. I am currently using Mint-Sylvia-KDE.
However, Tara will no longer support KDE. Hence we will need to consider moving to a different desktop environment, and each comes with a different set of software. A few potential concerns that will cause headaches for the end-user:
256-bit AES encryption for PDFs.
A lot of users won’t know what this means, but it can be problematic for people who are not familiar with the PDF format and find that their basic PDF readers cannot decrypt their PDFs. Ideally, a password prompt appears, the password is entered and the PDF is decrypted.
Ugly Qt applications
A lot of applications run under Qt, such as Transmission, TeamSpeak, TeamViewer and Spotify. A Qt configuration utility can be used, but I would rather things work out of the box.
The sad fact is that Microsoft Office remains the de facto standard for commercial use. LibreOffice 6 will be out soon. General users can consider Wine (only up to MS Office 2013) or use MS Office 365.
Firefox vs Chrome is the question. About 60% of users now use Chrome and only about 10% of users use Firefox. It would be ideal to have both, one for daily use and one for backup. Mint has traditionally used Firefox as the default but that is not the first choice for most users. Chrome on the other hand may have issues with some docks. AFAIK, Latte dock works well with Chrome.
Latte Dock with Chrome
We will keep the site updated once Mint Tara is released.
Plotting data on a graph is like looking at a bunch of stars. They all look the same and the data is difficult to interpret. What if there was a method to colour the stars?
Red stars are associated with red flowers and blue stars are associated with yellow flowers
Cluster analysis groups observations into subsets, called clusters, based on the similarity of their responses on several variables. The data is plotted on a graph, and the different clusters are coloured and separated to make sense of the data. Today we will be using the same data set (NESARC wave 1) as the previous two examples: 1, 2
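To make the grouping step concrete, here is a minimal k-means sketch in plain Python. It is an illustration only: it runs on two made-up blobs of points, not on the NESARC data, and the function names and settings (`kmeans`, `k=2`, 20 iterations) are my own assumptions. Each point is assigned to its nearest cluster centre, the centres are moved to the mean of their members, and the two steps repeat.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda c: squared_distance(p, centroids[c]))
        for p in points
    ]


def kmeans(points, k, iterations=20, seed=0):
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k randomly chosen observations
    for _ in range(iterations):
        labels = assign(points, centroids)
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # leave an empty cluster's centroid where it is
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return assign(points, centroids), centroids


# Hypothetical data: 50 points around (0, 0) and 50 points around (10, 10).
data_rng = random.Random(1)
points = [(data_rng.gauss(0, 0.5), data_rng.gauss(0, 0.5)) for _ in range(50)]
points += [(data_rng.gauss(10, 0.5), data_rng.gauss(10, 0.5)) for _ in range(50)]

labels, centroids = kmeans(points, k=2)
print("cluster centres:", centroids)
```

The `labels` list is what you would use to colour the "stars" on a scatter plot, with one colour per cluster.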
Lasso regression (a penalized regression method) is often used to select a subset of variables. It is a supervised machine learning method, and its name stands for “Least Absolute Shrinkage and Selection Operator”.
The shrinkage process identifies the variables most strongly associated with the selected target variable. We will be using the same target and explanatory variables from the post on Random Forests. This helps us compare the different ways to select important variables that affect the target.
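The shrinkage idea can be shown with a small coordinate-descent sketch. This is not the code from the Random Forests post: the data are synthetic, the features are assumed to be centred with no intercept, and the penalty `alpha=0.1` is picked by hand. The objective being minimised is the mean squared error divided by two plus `alpha` times the sum of the absolute weights, and the soft-threshold step is what pushes weak variables to exactly zero.

```python
import random


def soft_threshold(value, penalty):
    """Shrink value towards zero by penalty; anything inside the band becomes 0."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso(X, y, alpha, sweeps=100):
    """Coordinate descent for lasso (assumes centred features, no intercept)."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(sweeps):
        for j in range(d):
            rho = 0.0  # average of feature j times the residual that ignores feature j
            z = 0.0    # average squared value of feature j (assumed non-zero)
            for row, target in zip(X, y):
                partial = sum(w[m] * row[m] for m in range(d) if m != j)
                rho += row[j] * (target - partial)
                z += row[j] ** 2
            w[j] = soft_threshold(rho / n, alpha) / (z / n)
    return w


# Hypothetical data: only features 0 and 2 actually drive the target.
rng = random.Random(0)
X = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(300)]
y = [2.0 * row[0] - 1.5 * row[2] + rng.gauss(0, 0.1) for row in X]

weights = lasso(X, y, alpha=0.1)
print("weights:", [round(v, 3) for v in weights])
```

Variables whose weight ends at zero are the ones the lasso has dropped. The surviving weights are shrunk slightly towards zero, which is the price of the penalty.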
The decision tree in the previous posts is useful in exploring how variables can predict a particular target or response. However, small changes in data can lead to different results. Like decision trees, Random Forests also assess variables with respect to the data, but apply a set of simple rules repeatedly to decide which variables have the highest importance.
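As a rough illustration of "applying simple rules repeatedly", the sketch below uses a much-simplified stand-in for a random forest: many one-level trees (stumps), each grown on a bootstrap resample with a random subset of candidate variables. It ranks variables by how often they win the split, which is not the importance measure a real random forest library reports. The yes/no survey-style data and all names are made up.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def best_feature(rows, labels, candidates):
    """Pick the candidate yes/no feature whose split leaves the least impurity."""
    def impurity_after_split(f):
        yes = [y for row, y in zip(rows, labels) if row[f] == 1]
        no = [y for row, y in zip(rows, labels) if row[f] == 0]
        return (len(yes) * gini(yes) + len(no) * gini(no)) / len(rows)
    return min(candidates, key=impurity_after_split)


def stump_forest_picks(rows, labels, n_stumps=200, n_candidates=2, seed=0):
    """Count how often each feature wins the split across bootstrapped stumps."""
    rng = random.Random(seed)
    n_features = len(rows[0])
    picks = Counter()
    for _ in range(n_stumps):
        # Bootstrap: resample the rows with replacement (a small change in the data).
        idx = [rng.randrange(len(rows)) for _ in rows]
        sample_rows = [rows[i] for i in idx]
        sample_labels = [labels[i] for i in idx]
        # Each stump only sees a random subset of the variables.
        candidates = rng.sample(range(n_features), n_candidates)
        picks[best_feature(sample_rows, sample_labels, candidates)] += 1
    return picks


# Hypothetical data: the target follows feature 0 about 90% of the time.
data_rng = random.Random(3)
rows = [[data_rng.randint(0, 1) for _ in range(4)] for _ in range(300)]
labels = [row[0] if data_rng.random() < 0.9 else 1 - row[0] for row in rows]

picks = stump_forest_picks(rows, labels)
print("times each feature won the split:", dict(picks))
```

A single stump can be swayed by which rows it happens to see, but the tally across many resamples is much steadier, and that is the intuition behind using a forest instead of one tree.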
A decision tree can predict a particular target or response. I built the decision tree below using machine learning to test several relationships found in the National Longitudinal Study of Adolescent Health, a survey conducted in the United States.