
Introducing the Data Science Stack: Establish a machine learning environment using 3 prompts on Ubuntu.


Unveiling the Data Analytics Toolbox: Set up a machine learning environment with 3 commands on Ubuntu

Canonical, the publisher of Ubuntu, has released the Data Analytics Toolbox (DAT), an out-of-the-box solution for data science that sets up machine learning environments on your AI workstation. It is fully open source, free to use, and designed for Ubuntu. It also runs on other Linux distributions, on Windows through the Windows Subsystem for Linux (WSL), and on macOS through Multipass. The DAT is a command-line tool that bundles Jupyter Notebook, MLflow, and frameworks such as PyTorch and TensorFlow on top of a container orchestration layer. Canonical provides security maintenance for all packages included in the solution, so vulnerabilities are patched promptly and both the software and the artifacts produced with it stay protected.

Set up your ML environment with just three commands

Data science adoption is widespread, but so are the obstacles to doing it effectively. Deloitte reports the following figures:

  • 40% of organizations are integrating AI
  • 25% of AI professionals encounter obstacles due to package interdependencies
  • 24% of AI professionals face challenges accessing computing resources

Given this landscape, business leaders are under pressure to develop AI capabilities quickly and demonstrate ROI from AI initiatives. Reducing the time needed to set up ML environments is key to speeding up project delivery and the initial exploration phase of AI within enterprises. This is the motivation behind the Data Analytics Toolbox (DAT).

The DAT can be set up with just three commands, enabling quick initial exploration on AI workstations. Users only need to set up their container orchestration layer, install the DAT CLI, and initialize the Data Analytics Toolbox to access the environment. Setup takes 10 to 30 minutes, depending on the user’s experience.
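The three steps can be sketched as a small dry-run script. This is a hedged illustration, not the article's own procedure: it assumes the tool is the Data Science Stack named in the page title, that the orchestration layer is MicroK8s, and that the CLI is the `dss` command shipped in the `data-science-stack` snap. The command strings are recalled from that project's documentation rather than taken from this article, and the real setup may also pin a MicroK8s channel and enable add-ons such as storage, so check the official docs before running anything.

```python
import subprocess

# Assumption: these commands come from the Data Science Stack documentation,
# not from this article. Verify them against the current docs before use.
SETUP_COMMANDS = [
    # 1. Set up the container orchestration layer (MicroK8s, assumed).
    "sudo snap install microk8s --classic",
    # 2. Install the CLI (snap name assumed: data-science-stack).
    "sudo snap install data-science-stack --channel latest/stable",
    # 3. Initialise the stack against the local cluster.
    'dss initialize --kubeconfig="$(sudo microk8s config)"',
]


def run_setup(commands=SETUP_COMMANDS, dry_run=True):
    """Print each setup command and run it only when dry_run is False.

    Returns the list of commands that were printed (and possibly executed).
    """
    processed = []
    for cmd in commands:
        print(f"$ {cmd}")
        if not dry_run:
            # shell=True is needed for the $(...) substitution in step 3.
            subprocess.run(cmd, shell=True, check=True)
        processed.append(cmd)
    return processed


if __name__ == "__main__":
    # Dry run by default: only shows what would be executed.
    run_setup(dry_run=True)
```

Keeping the default as a dry run lets you review the commands first; pass `dry_run=False` only on a machine where you intend to install the snaps.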

Chris Schnabel, Canonical’s Silicon Alliance Ecosystem Manager, explained, “This removes the burden of managing package dependencies or setting up computing resources, thanks to simple commands designed specifically for AI practitioners.” By default, the DAT includes Jupyter Notebook for model development, MLflow for experiment tracking and model registry, and ML frameworks such as PyTorch and TensorFlow. Users can also customize the Data Analytics Toolbox and add new libraries to suit their needs.

Designed to run on a range of hardware configurations

The Data Analytics Toolbox was designed to run on different hardware configurations so that users get the best performance from the hardware they choose. The DAT uses upstream ML frameworks such as PyTorch and TensorFlow, giving users the most widely used distributions. Intel, for instance, contributes its hardware optimizations to these upstream projects. However, to get early access to performance improvements and features such as Intel GPU support before they land upstream, AI practitioners can also use IPEX and ITEX, Intel’s extensions for PyTorch and TensorFlow respectively. IPEX and ITEX apply hardware-specific optimizations that take advantage of Advanced Vector Extensions (AVX), Vector Neural Network Instructions (VNNI), and Advanced Matrix Extensions (AMX). Combining these instruction-set extensions with GPU acceleration speeds up the operations that dominate AI workloads, which shortens model training time and the experimentation phase of ML projects.
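Whether a given workstation can benefit from these instruction sets depends on what its CPU reports. The sketch below is a minimal, Linux-only illustration that is not part of the toolbox itself: it reads the `flags` line of `/proc/cpuinfo` on an x86 machine and groups a few commonly reported flag names under the AVX, VNNI, and AMX families. The flag lists are illustrative rather than exhaustive, and the function names are made up for this example.

```python
from pathlib import Path

# Flag names as they appear in the "flags" line of /proc/cpuinfo on x86 Linux.
# Illustrative subset only; other related flags exist.
FEATURE_FLAGS = {
    "AVX": ("avx", "avx2", "avx512f"),
    "VNNI": ("avx512_vnni", "avx_vnni"),
    "AMX": ("amx_tile", "amx_bf16", "amx_int8"),
}


def parse_cpu_flags(cpuinfo_text: str) -> set[str]:
    """Return the flags from the first 'flags' line of cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def detect_features(flags: set[str]) -> dict[str, list[str]]:
    """Map each feature family to the known flags that are present."""
    return {
        family: [flag for flag in candidates if flag in flags]
        for family, candidates in FEATURE_FLAGS.items()
    }


def report(cpuinfo_path: str = "/proc/cpuinfo") -> dict[str, list[str]]:
    """Read cpuinfo if it exists; otherwise report no features."""
    path = Path(cpuinfo_path)
    text = path.read_text() if path.exists() else ""
    return detect_features(parse_cpu_flags(text))


if __name__ == "__main__":
    for family, found in report().items():
        print(f"{family}: {', '.join(found) if found else 'not reported'}")
```

An empty list for a family means the CPU does not advertise it (or the file is unavailable, for example on non-Linux systems), in which case the corresponding optimizations would not be expected to apply.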

“Canonical’s Data Analytics Toolbox provides an essential foundation for AI professionals to accelerate their machine learning and data science work,” remarked Arun Gupta, Vice President and General Manager for Open Ecosystem at Intel. “By aligning with mainstream PyTorch and TensorFlow distributions, we ensure developers are using up-to-date tools. Our collaboration via the OPEA project amplifies this effect, simplifying AI development and democratizing innovation for all.”

AI workstations are a strategic focus for many computer manufacturers. Solutions like the Data Analytics Toolbox help them deliver a consistent experience across all devices, so they can change their choice of GPU without affecting the user experience.

Get security maintenance and support from a single vendor

McKinsey has reported that 51% of organizations using AI cite cybersecurity as the top risk they are working to mitigate, followed by regulatory compliance at 36%. These risks apply at every layer and scale of ML development, from AI workstations to data centers and edge devices.

When data scientists set up their ML environments, they often pull containers and open-source tools from many different sources, sometimes overlooking the associated security risks. In an enterprise setting, security quickly becomes a challenge for system administrators who must deploy and maintain a variety of tools that lack uniformity because there is no common standard. The DAT provides a consistent architecture for ML environments that can be rolled out at scale across many devices.

Ubuntu is the most popular Linux distribution (source: Stack Overflow report), and a large share of AI/ML professionals choose it for their projects. As business leaders allocate resources for ML projects and practitioners begin their initial exploration, they will gradually deploy solutions on workstations.

Companies can get security maintenance and support through Ubuntu Pro. Enterprises receive enterprise-grade support for their ML environments, with faster issue resolution under Canonical’s service level agreements (SLAs).

To learn more about the Data Analytics Toolbox, watch our webinar. We’ll walk through the features you can use and demonstrate the setup with the three commands.

Further reading

  • [webinar] An in-depth look at the Data Analytics Toolbox
  • [blog] Risks linked to machine learning security
