How to use the optKey method in org.apache.mahout.clustering.canopy.TestCanopyCreation

Best Java code snippets using org.apache.mahout.clustering.canopy.TestCanopyCreation.optKey (Showing top 4 results out of 315)

/**
 * Story: User can produce final point clustering using a Hadoop map/reduce
 * job and an EuclideanDistanceMeasure.
 */
@Test
public void testClusteringEuclideanMR() throws Exception {
 List<VectorWritable> points = getPointsWritable();
 Configuration conf = getConfiguration();
 ClusteringTestUtils.writePointsToFile(points, true, 
   getTestTempFilePath("testdata/file1"), fs, conf);
 ClusteringTestUtils.writePointsToFile(points, true, 
   getTestTempFilePath("testdata/file2"), fs, conf);
 // now run the Job using the run() command. Others can use runJob().
 Path output = getTestTempDirPath("output");
 String[] args = { optKey(DefaultOptionCreator.INPUT_OPTION),
   getTestTempDirPath("testdata").toString(),
   optKey(DefaultOptionCreator.OUTPUT_OPTION), output.toString(),
   optKey(DefaultOptionCreator.DISTANCE_MEASURE_OPTION),
   EuclideanDistanceMeasure.class.getName(),
   optKey(DefaultOptionCreator.T1_OPTION), "3.1",
   optKey(DefaultOptionCreator.T2_OPTION), "2.1",
   optKey(DefaultOptionCreator.CLUSTERING_OPTION),
   optKey(DefaultOptionCreator.OVERWRITE_OPTION) };
 ToolRunner.run(getConfiguration(), new CanopyDriver(), args);
 Path path = new Path(output, "clusteredPoints/part-m-00000");
 long count = HadoopUtil.countRecords(path, conf);
 assertEquals("number of points", points.size(), count);
}
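
The test above passes every option name through optKey before handing the array to ToolRunner. A minimal sketch of that pattern follows, assuming optKey simply prefixes the long option name with "--". The class OptKeySketch, the helper buildArgs, and the literal option names ("input", "output", "t1", "t2", "clustering", "overwrite", "method", "sequential") are illustrative assumptions, not taken from the Mahout source.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the argument-building pattern used in the tests.
class OptKeySketch {

  // Assumption: optKey maps a long option name to its command-line form
  // by prefixing "--". This is a sketch, not Mahout's implementation.
  static String optKey(String optionName) {
    return "--" + optionName;
  }

  // Hypothetical helper that assembles an argument array shaped like the
  // ones in the snippets. Option names here are assumed values.
  static String[] buildArgs(String input, String output,
                            String t1, String t2, boolean sequential) {
    List<String> args = new ArrayList<>();
    // Options that take a value: key followed by the value.
    args.add(optKey("input"));
    args.add(input);
    args.add(optKey("output"));
    args.add(output);
    args.add(optKey("t1"));
    args.add(t1);
    args.add(optKey("t2"));
    args.add(t2);
    // Flag options: the key alone, with no value after it.
    args.add(optKey("clustering"));
    args.add(optKey("overwrite"));
    // Optional execution method, as in the later snippets.
    if (sequential) {
      args.add(optKey("method"));
      args.add("sequential");
    }
    return args.toArray(new String[0]);
  }

  public static void main(String[] unused) {
    String[] args = buildArgs("testdata", "output", "3.1", "2.1", true);
    System.out.println(String.join(" ", args));
  }
}
```

Note that value options contribute two array elements (key, then value), while flags such as the clustering and overwrite options contribute only the key, which is why they appear without a following string in the test's args array.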

/**
 * Story: User can produce final point clustering using a Hadoop map/reduce
 * job and an EuclideanDistanceMeasure and outlier removal threshold.
 */
@Test
public void testClusteringEuclideanWithOutlierRemovalMR() throws Exception {
 List<VectorWritable> points = getPointsWritable();
 Configuration conf = getConfiguration();
 ClusteringTestUtils.writePointsToFile(points, true, 
   getTestTempFilePath("testdata/file1"), fs, conf);
 ClusteringTestUtils.writePointsToFile(points, true, 
   getTestTempFilePath("testdata/file2"), fs, conf);
 // now run the Job using the run() command. Others can use runJob().
 Path output = getTestTempDirPath("output");
 String[] args = { optKey(DefaultOptionCreator.INPUT_OPTION),
   getTestTempDirPath("testdata").toString(),
   optKey(DefaultOptionCreator.OUTPUT_OPTION), output.toString(),
   optKey(DefaultOptionCreator.DISTANCE_MEASURE_OPTION),
   EuclideanDistanceMeasure.class.getName(),
   optKey(DefaultOptionCreator.T1_OPTION), "3.1",
   optKey(DefaultOptionCreator.T2_OPTION), "2.1",
   optKey(DefaultOptionCreator.OUTLIER_THRESHOLD), "0.7",
   optKey(DefaultOptionCreator.CLUSTERING_OPTION),
   optKey(DefaultOptionCreator.OVERWRITE_OPTION) };
 ToolRunner.run(getConfiguration(), new CanopyDriver(), args);
 Path path = new Path(output, "clusteredPoints/part-m-00000");
 long count = HadoopUtil.countRecords(path, conf);
 int expectedPointsAfterOutlierRemoval = 8;
 assertEquals("number of points", expectedPointsAfterOutlierRemoval, count);
}

// Excerpt from a test method: output and config are defined earlier in the enclosing method.
String[] args = { optKey(DefaultOptionCreator.INPUT_OPTION),
  getTestTempDirPath("testdata").toString(),
  optKey(DefaultOptionCreator.OUTPUT_OPTION), output.toString(),
  optKey(DefaultOptionCreator.DISTANCE_MEASURE_OPTION),
  EuclideanDistanceMeasure.class.getName(),
  optKey(DefaultOptionCreator.T1_OPTION), "3.1",
  optKey(DefaultOptionCreator.T2_OPTION), "2.1",
  optKey(DefaultOptionCreator.OUTLIER_THRESHOLD), "0.5",
  optKey(DefaultOptionCreator.CLUSTERING_OPTION),
  optKey(DefaultOptionCreator.OVERWRITE_OPTION),
  optKey(DefaultOptionCreator.METHOD_OPTION),
  DefaultOptionCreator.SEQUENTIAL_METHOD };
ToolRunner.run(config, new CanopyDriver(), args);

// Excerpt from a test method: output and config are defined earlier in the enclosing method.
String[] args = { optKey(DefaultOptionCreator.INPUT_OPTION),
  getTestTempDirPath("testdata").toString(),
  optKey(DefaultOptionCreator.OUTPUT_OPTION), output.toString(),
  optKey(DefaultOptionCreator.DISTANCE_MEASURE_OPTION),
  EuclideanDistanceMeasure.class.getName(),
  optKey(DefaultOptionCreator.T1_OPTION), "3.1",
  optKey(DefaultOptionCreator.T2_OPTION), "2.1",
  optKey(DefaultOptionCreator.CLUSTERING_OPTION),
  optKey(DefaultOptionCreator.OVERWRITE_OPTION),
  optKey(DefaultOptionCreator.METHOD_OPTION),
  DefaultOptionCreator.SEQUENTIAL_METHOD };
ToolRunner.run(config, new CanopyDriver(), args);

Popular methods of TestCanopyCreation

assertEquals
assertFalse
assertTrue
findAndRemove
getConfiguration
getPoints
getPointsWritable
getTestTempDirPath
getTestTempFilePath
printCanopies
Print the canopies to the transcript

