A week of blogging. Ingredients of basic distribution analysis with Mahout? Hadoop debugging via a simple log service. Missing New York. Reflecting on the current state of information security. Jetty-based Hadoop log service in 10 minutes. Maven makes simple app development feel like stacking bricks. Mixing codebases is trivial with direct pushes to the local repo. Unfortunately, adding dependencies to a complex Maven hierarchy can be challenging. Used to Maven's copy-dependencies into lib/, totally forgot about the Hadoop way and its put-it-all-in-the-task-jar paradigm. Oh, well. We're good now.
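For the record, a minimal sketch of what such a log service could look like. This is not the Jetty code from the week: to keep the example dependency-free it swaps in the JDK's built-in `com.sun.net.httpserver.HttpServer`. The class name `LogService`, the `/log` path, the `send` helper and the sample messages are all hypothetical, chosen for illustration only. Tasks POST one message per request; the service keeps everything in memory.

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical in-memory log collector; names and the /log path are illustrative. */
class LogService {
    private final List<String> lines = new CopyOnWriteArrayList<>();
    private final HttpServer server;

    LogService(int port) throws IOException {
        // Port 0 asks the OS for any free port.
        server = HttpServer.create(new InetSocketAddress("127.0.0.1", port), 0);
        server.createContext("/log", this::handle);
    }

    void start() { server.start(); }
    void stop() { server.stop(0); }
    int port() { return server.getAddress().getPort(); }
    List<String> lines() { return lines; }

    // POST appends the request body as one log line;
    // any other method returns everything collected so far.
    private void handle(HttpExchange ex) throws IOException {
        byte[] reply;
        if ("POST".equals(ex.getRequestMethod())) {
            byte[] body = ex.getRequestBody().readAllBytes();
            lines.add(new String(body, StandardCharsets.UTF_8));
            reply = "ok\n".getBytes(StandardCharsets.UTF_8);
        } else {
            reply = (String.join("\n", lines) + "\n").getBytes(StandardCharsets.UTF_8);
        }
        ex.sendResponseHeaders(200, reply.length);
        try (OutputStream out = ex.getResponseBody()) {
            out.write(reply);
        }
    }

    /** What a task would call instead of writing to its local log. Returns the HTTP status. */
    static int send(String url, String message) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        try (OutputStream out = conn.getOutputStream()) {
            out.write(message.getBytes(StandardCharsets.UTF_8));
        }
        int status = conn.getResponseCode();
        conn.disconnect();
        return status;
    }

    public static void main(String[] args) throws IOException {
        LogService service = new LogService(0);
        service.start();
        String url = "http://127.0.0.1:" + service.port() + "/log";
        // Made-up messages standing in for what map/reduce tasks would report.
        send(url, "task_0001: map phase started");
        send(url, "task_0002: bad record at offset 42");
        service.lines().forEach(System.out::println);
        service.stop();
    }
}
```

The point of the design is that every task reports to one place, so there is no digging through per-node task logs. A real service would at least need persistence and a bound on memory.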
Betrayed by the data. Creating algorithms that promise data magic can be quite thankless come payday. Don't commit to results. Get outlook estimates fast. Always keep the pessimistic edge. Checking out multipart MIME in support of HTTP streaming work.
Iterating on Mahout design. Cabernet taste. Simple distribution statistics task in progress. Moving to 2-space indent. Going for Hadoop-free Mahout tasks. Interesting set of *Job interfaces, sharing nothing but the runJob() convention. The chaos of DFS file management. Can't run two processes in parallel. Summoning trouble. DFS might allow concurrent access, but the code usually doesn't. Visualizing a general tree-like code architecture.
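A rough sketch of the kind of thing a Hadoop-free distribution statistics task might compute. Nothing here is Mahout API: `DistributionStats` and its methods are hypothetical names, and the choice of Welford's one-pass update plus an equal-width histogram is an assumption about what "simple" means, not the actual task design.

```java
import java.util.Arrays;

/** Hypothetical helper, not part of Mahout: one-pass summary of a numeric sample. */
class DistributionStats {
  private long count;
  private double mean;
  private double m2; // running sum of squared deviations from the mean
  private double min = Double.POSITIVE_INFINITY;
  private double max = Double.NEGATIVE_INFINITY;

  /** Welford's update: numerically stable, no second pass over the data. */
  void add(double x) {
    count++;
    double delta = x - mean;
    mean += delta / count;
    m2 += delta * (x - mean);
    min = Math.min(min, x);
    max = Math.max(max, x);
  }

  long count() { return count; }
  double mean() { return mean; }
  /** Sample variance (n - 1 in the denominator); 0 for fewer than two values. */
  double variance() { return count > 1 ? m2 / (count - 1) : 0.0; }
  double stdDev() { return Math.sqrt(variance()); }
  double min() { return min; }
  double max() { return max; }

  /** Equal-width histogram over [lo, hi]; out-of-range values go to the edge bins. */
  static long[] histogram(double[] values, double lo, double hi, int bins) {
    long[] counts = new long[bins];
    double width = (hi - lo) / bins;
    for (double v : values) {
      int i = (int) ((v - lo) / width);
      counts[Math.max(0, Math.min(bins - 1, i))]++;
    }
    return counts;
  }

  public static void main(String[] args) {
    double[] sample = {2, 4, 4, 4, 5, 5, 7, 9}; // made-up data
    DistributionStats s = new DistributionStats();
    for (double v : sample) {
      s.add(v);
    }
    System.out.printf("n=%d mean=%.3f sd=%.3f min=%.1f max=%.1f%n",
        s.count(), s.mean(), s.stdDev(), s.min(), s.max());
    System.out.println(Arrays.toString(histogram(sample, s.min(), s.max(), 4)));
  }
}
```

Since the accumulator only needs `add`, the same class could later be fed from a mapper or from a plain local file reader, which is the appeal of keeping the task free of Hadoop types.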
MAHOUT-157: Frequent Pattern Mining using Parallel FP-Growth. Interesting. Machine Learning in Computational Finance, Victor Boyarshinov's PhD thesis. Interesting overview of training optimal trading strategies and of optimal separation learning. However, in practice, all of the useful topics actually fall under the "Computational Statistics" umbrella.
Time. Ticking away.