Wednesday, February 18, 2015

Data Today: Live From Strata + Hadoop World

A million rows isn't cool. A billion rows--that's cool.  
O'Reilly Data Newsletter

1. Live from Strata + Hadoop World

Strata + Hadoop World has sold out again. If you're here in San Jose with us, it's great to see you. A little advice:
  • Overwhelmed by session choices? Our friends at Dato have created a Strata Session Recommender that will help you narrow down the choices based on the session(s) you know you can't miss. Check it out.
  • Wear comfortable shoes, hydrate, and talk to the people around you. The "Hallway Track" is an incredible experience—you'll meet some of the most interesting people in data here.
If you aren't here, you're missing a good time, but there are a couple ways you can "ride along":

2. A million rows isn't cool. You know what's cool?

A billion rows. John Russell explains how to change your frame of reference when starting with SQL on Hadoop.

3. Cyber Threat Intelligence Center

"Currently, no single government entity is responsible for producing coordinated cyber threat assessments and sharing the information rapidly," says Lisa Monaco, President Obama’s homeland security and counterterrorism adviser. That will change. The White House announced a new Cyber Threat Intelligence Center to aggregate cyber intelligence data to "fill the gaps."

4. Drawing heatmaps in R

Digithead offers a tutorial on drawing heatmaps in R.

5. Social network analysis

Ben Lorica provides an understanding of network structure and dynamics in online social systems, including information cascades, viral content, and significant relationships.

Sponsored Content

Self-service data prep

Looking for a solution that puts data prep in the hands of analysts AND gives the enterprise a platform for collaboration and governance? Look no further: Paxata Adaptive Data Preparation.™ Designed for business analysts, built on Hadoop™ and powered by Apache Spark.™
Meet Paxata at Strata + Hadoop World and see how a modern data prep solution can drive greater value from your big data strategy.

6. Here are 3 ways to lie with a data viz

Three of the most common ways in which visualizations can be misleading.

7. Feature learning & deep learning

Stanford offers an unsupervised feature learning and deep learning tutorial. (It includes code and exercises.)

8. In the community

Ben Lorica will host a panel on "The Lambda Architecture" at the Hive Big Data Think Tank meetup in Sunnyvale, CA, on Feb 25 from 6 pm to 9 pm. If you're in the area, make plans to attend. The panel includes Eric Baldeschwieler, Jai Ranganathan, Ron Bodkin, Jay Kreps, and Patrick Wendell.

9. Where's Waldo?

Want to find him? Start with kernel density estimates and progress through search path optimization to find Waldo. (Hint: If Waldo isn’t on the bottom half of the left page, then he’s probably not on the left page at all.)

10. Freebie of the week

Machine learning isn't a set-it-and-forget-it operation. Even with solid examples, ML algorithms can still fail and end up blocking important emails, filtering out useful content, and causing a variety of other problems. A new free report by industry analyst Ted Cuzzillo, Real World Active Learning, examines real-world examples of active learning and shows how (and where) to insert human judgment to actively improve the algorithm's performance. Thanks to Crowdflower for sponsoring this week's report.
Get the Real World Active Learning report free  →
