What’s new in RapidMiner 7.4?

2017-02-27

Background Execution


This feature is only available for users with a Large license.

Processes can now be executed in the background of Studio while you work on a different process in the user interface.

 


Background Monitor

The Background Monitor displays the current state of the background processes and gives access to the results.

 


Parallel Loops


RapidMiner now features new parallel Loop operators that run multiple iterations at once. They make full use of your available CPU cores (up to the limits of your RapidMiner license), which can greatly speed up your processes.

New parallelized Loop operator.
New parallelized Loop Values operator.
New parallelized Loop Attributes operator.
New parallelized Loop Files operator.

Please note that existing processes will continue to use the old operators for compatibility reasons. To use the new operators in such processes, you need to manually replace the existing operators with the new ones.
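As a rough illustration of what running several iterations at once under a core limit means, here is a minimal Python sketch. It is not RapidMiner's implementation: `effective_workers`, `parallel_loop` and `license_core_limit` are hypothetical names, and a thread pool merely stands in for the operator's scheduler.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def effective_workers(license_core_limit, available_cores=None):
    """Hypothetical helper: cap the worker count by both machine and license."""
    if available_cores is None:
        available_cores = os.cpu_count() or 1
    return max(1, min(available_cores, license_core_limit))


def parallel_loop(body, items, license_core_limit):
    """Run body(item) for each item, several at a time; results keep input order."""
    workers = effective_workers(license_core_limit)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(body, items))


# Assumed example: five loop iterations on a license that allows four cores.
squares = parallel_loop(lambda x: x * x, range(5), license_core_limit=4)
```

The number of concurrent iterations is the smaller of the available cores and the licensed limit, so a larger machine does not help beyond what the license allows.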

 

Order your Repository


The Repository panel now lets you switch between alphanumeric and chronological order.

 


Granting Additional Permissions to Extensions


Users with Large licenses can now grant additional permissions to unsigned extensions. This is done through a toggle in the Start-up section of the Settings dialog. These permissions are not enabled by default, because they increase the security risk by allowing unknown software to run on your system.

 


Introducing SparkRM

RapidMiner Radoop 7.4 introduces SparkRM (available with the “Enterprise” license). With SparkRM any operator or process existing in RapidMiner Studio can be run in parallel in a Hadoop environment, leveraging Spark as the execution framework.

The user-defined Subprocess (i.e. visually defined code) in the new SparkRM meta-operator can contain any in-memory RapidMiner operator, including those from extensions. The operator encapsulates that subprocess and pushes it to Hadoop, where it is automatically executed inside Spark, potentially on multiple Hadoop nodes. Beforehand, the input data provided to the SparkRM operator is partitioned (by the values of an attribute, linearly, or randomly) and distributed to the Hadoop nodes. The RapidMiner subprocess is then run on each of those partitions. After execution, the result is merged if it’s a coherent dataset, or returned as a collection otherwise.
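The partition, run, and merge flow described above can be sketched in plain Python. This is a conceptual model only, not Radoop or Spark code: all function names are hypothetical, partitions are processed sequentially here rather than on cluster nodes, and "coherent dataset" is assumed to mean that every partition returns rows with the same columns.

```python
import random
from collections import defaultdict


def partition_by_attribute(rows, attribute):
    """One partition per distinct value of the given attribute."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[attribute]].append(row)
    return list(groups.values())


def partition_linear(rows, n_parts):
    """Contiguous blocks of roughly equal size."""
    size = max(1, -(-len(rows) // n_parts))  # ceiling division
    return [rows[i:i + size] for i in range(0, len(rows), size)]


def partition_random(rows, n_parts, seed=0):
    """Shuffle, then deal rows round-robin into n_parts partitions."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::n_parts] for i in range(n_parts)]


def is_coherent(results):
    """Assumed rule: every result is a list of dict rows sharing one column set."""
    columns = None
    for part in results:
        if not isinstance(part, list):
            return False
        for row in part:
            if not isinstance(row, dict):
                return False
            if columns is None:
                columns = set(row)
            elif set(row) != columns:
                return False
    return True


def push_down(partitions, subprocess):
    """Apply the subprocess to each partition; merge if coherent, else return a collection."""
    results = [subprocess(part) for part in partitions]
    if is_coherent(results):
        return [row for part in results for row in part]
    return results


rows = [
    {"region": "EU", "sales": 10},
    {"region": "US", "sales": 7},
    {"region": "EU", "sales": 5},
]
parts = partition_by_attribute(rows, "region")

# A row-wise subprocess yields one merged dataset.
merged = push_down(parts, lambda part: [{**r, "sales_x2": r["sales"] * 2} for r in part])

# A subprocess that returns a single number per partition yields a collection.
totals = push_down(parts, lambda part: sum(r["sales"] for r in part))
```

In this sketch `merged` is a single list of rows, while `totals` remains a list with one entry per partition, which mirrors the merged dataset versus collection distinction.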

SparkRM opens up a variety of new use cases that Radoop can now solve natively on Hadoop, especially those that need an extension, such as text analytics, process mining, time series analytics, or forecasting. For a more detailed guide, check the SparkRM: Process Pushdown section in the documentation.

Support for Hadoop user impersonation (“proxy” user)

RapidMiner Radoop 7.4 now also supports Hadoop user impersonation, significantly simplifying Radoop connection setup and management when connecting to a Hadoop cluster using RapidMiner Server. A Radoop connection on RapidMiner Server can be defined using the credentials (password or keytab) of a Hadoop “proxy” super-user. When a RapidMiner Studio user logs in to RapidMiner Server, she is authenticated using her RapidMiner credentials. Once logged in, whenever she runs a Radoop job, the super-user then impersonates the RapidMiner user and the job will have the rights and privileges granted to that same user in Hadoop.

This approach reduces administrative work as a single Radoop connection in RapidMiner Server can be used by multiple users. It is especially useful in multi-user installations.
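To make the impersonation flow concrete, here is a toy Python model. It does not use any Hadoop or RapidMiner API: `ToyCluster`, `SharedConnection`, the user names and the paths are all hypothetical, and the permission check is a simplified stand-in for what the cluster would enforce.

```python
from dataclasses import dataclass


@dataclass
class ToyCluster:
    """Toy stand-in for a Hadoop cluster (hypothetical, for illustration only)."""
    proxy_users: set    # accounts allowed to impersonate other users
    permissions: dict   # user name -> set of paths that user may read

    def run_job(self, authenticated_as, effective_user, path):
        # Only a configured proxy user may act on behalf of someone else.
        if authenticated_as != effective_user and authenticated_as not in self.proxy_users:
            raise PermissionError(f"{authenticated_as} may not impersonate {effective_user}")
        # The job gets the rights of the impersonated user, not of the proxy user.
        if path not in self.permissions.get(effective_user, set()):
            raise PermissionError(f"{effective_user} may not read {path}")
        return f"job on {path} ran as {effective_user}"


@dataclass
class SharedConnection:
    """One server-side connection holding the proxy user's identity."""
    cluster: ToyCluster
    proxy_user: str

    def run_as(self, studio_user, path):
        return self.cluster.run_job(self.proxy_user, studio_user, path)


cluster = ToyCluster(
    proxy_users={"radoop_proxy"},
    permissions={"alice": {"/data/alice"}, "bob": {"/data/bob"}},
)
conn = SharedConnection(cluster, proxy_user="radoop_proxy")

result = conn.run_as("alice", "/data/alice")
```

Both users share the single connection, yet each job is limited to what the impersonated user may access: in this model, `conn.run_as("bob", "/data/alice")` raises `PermissionError`.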