Thursday, December 25, 2014

Data Today: What people buy where, Twitter-based linguistics study + More

O'Reilly Data Newsletter

1. Wouldn't it be fun to build your own Google?

What if you had your own copy of the entire web, and you could do whatever you want with it?

2. The "high-interest credit card of technical debt"

Here's a Google white paper on the problems that machine learning can create.

3. Are SQL and NoSQL designs beginning to merge?

Is the divide between SQL and NoSQL shrinking?

4. Moving away from proprietary

Big data is moving into the mainstream, says Steve Lohr, but it's not a get-rich-quick game. Many of the most promising big data companies are supplying open source software and operating as tech support and consultants rather than selling proprietary products.

5. Twitter-based linguistics study

Do you know what "ard" means? (And if you do, what does that say about you?) Neologisms (new words, abbreviations, and even emoticons) spread quickly via Twitter. But linguists studying Twitter data say that pre-existing cultural boundaries still affect how they spread.

Sponsored Content

Automated data curation

Understanding relationships and curating a massive variety of siloed data manually can take extraordinary time and effort. Learn how to drastically reduce the heavy lifting when unifying and enriching data from disparate sources with Tamr, Inc., using human-guided machine learning.
Tamr Technical Overview →

6. Hard Core Data Science Day

The agenda is confirmed for the Hard Core Data Science Day at Strata + Hadoop World in San Jose (Feb. 17-20)—and if we do say so ourselves, the line-up is pretty spectacular. Ben Recht and Ben Lorica have put together a program that includes:
  • Deep learning: Tara Sainath (speech) and Fei-Fei Li (vision)
  • Machine learning: Michael Jordan (statistical decision theory & big data), Anima Anandkumar (tensors), Maya Gupta (interpretable & robust models), and John Canny (the new BIDMach toolkit)
  • Applications: David Andrzejewski (Graph Mining techniques for machine data), Eamonn Keogh (mining large-scale time-series), and Chris Re (recent apps of the DeepDive knowledge base framework)
But wait, there's more! John Myles White will explain why data scientists should consider the Julia programming language, and Alyosha Efros will outline recent progress in Visual Data Mining techniques.
Hardcore Data Science Day sells out every year. If you're coming to Strata + Hadoop World, register now to save your spot at Hardcore Data Science Day.

7. What people buy where

Here's an interesting data viz on how conspicuous consumption varies geographically. Watch the cities rearrange themselves by what they buy. See who spends the most on pets, alcohol, shoes, or hairpieces.

8. The ethics of big data in education

This paper's authors say that the "behaviorist model of human nature is at the foundation of every data model. While it is generally reasonable, one should note that it directly contradicts the rational utility maximizer model of human nature used in microeconomics or the habitual perspective of behavioral economics, and has very different implications for interventions."

Bitcoin & The Block Chain Summit

The brightest minds in bitcoin and the blockchain are lined up to share their expertise and give you a glimpse of the future at the O'Reilly Radar Summit: Bitcoin & the Blockchain (Jan 27 in San Francisco).
They'll help you separate hyperbole from reality and discover the new business models and immense unrecognized opportunities that bitcoin—and the blockchain behind it—creates.
Seating is limited, and the deadline for the guaranteed Best Price is Thursday (Dec 18)—so reserve your spot today.
Find Out More →

9. More data, less waste

This aerobic food digester's machine learning algorithms identify trends and inefficiencies that lead to waste, so each unit can be fine-tuned to a specific customer's usage patterns. Performance adjustments can be made remotely on any connected unit.

10. Freebie of the week

Get Chapter 1 from Learning Spark Preview Edition, free, compliments of Pentaho. Chapter 1 covers:
  • What is Apache Spark?
  • Benefits of the tightly integrated Spark stack
  • Spark SQL and Shark
  • Spark for data science tasks
  • Spark for data processing applications
Download Learning Spark Chap. 1 →
