libsvm Chih-Jen Lin Some Thoughts on Large-scale Data Classi

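Genie · posted 2012-10-24 15:02:44

Today I had the good fortune to attend a talk by Chih-Jen Lin, mainly on large-scale data classification. He pointed out some common misconceptions about classification tasks, discussed doing classification in distributed environments (MPI/MapReduce), and outlined future directions for large-scale data classification. Recommended reading.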

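twinsken · posted 2012-11-26 00:35:04

Developing large-scale machine learning algorithms depends heavily on parallel computing frameworks. Hadoop was not originally designed for machine learning, and some people hope to decouple the algorithms from it so that they can run on several different underlying computation models.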

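Genie · posted 2012-11-23 14:51:44

Which issue was it that got debated for so long?? Spark is already in use at some companies; companies in China are always a step behind everyone else.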

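twinsken · posted 2012-11-22 12:10:33

The Mahout community argued about this issue for a long time and in the end stuck with Hadoop and HDFS, because they are widely used in industry, while Spark, Puma, and the like are still quite immature.
P.S. Could someone share the download?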

Genie · posted 2012-10-24 22:50:05

chenwq posted on 2012-10-24 19:34
"A framework is like a language or a specification. You can then have different implementations"

...

Hadoop and MapReduce were not designed specifically for machine learning applications, and we need to know when and where they are suitable. Why is Hadoop insufficient for iterative algorithms? Because it incurs expensive disk I/O.
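To make the disk I/O point concrete, here is a minimal sketch (mine, not from the talk). It does not use Hadoop at all; it only simulates the access pattern, on the assumption that each iteration of a plain MapReduce job rereads its input from disk, while an in-memory approach loads the data once. All names (`fit_rereading`, `fit_cached`, `demo`) and the toy one-parameter least-squares problem are made up for illustration.

```python
import os
import tempfile


def write_dataset(path, rows):
    """Write (x, y) pairs to a text file, one pair per line."""
    with open(path, "w") as f:
        for x, y in rows:
            f.write(f"{x} {y}\n")


def read_dataset(path, counter):
    """Read the whole dataset from disk and count this as one full pass."""
    counter["disk_passes"] += 1
    with open(path) as f:
        return [tuple(float(v) for v in line.split()) for line in f]


def gradient(w, data):
    """Gradient of the mean squared error for the model y ~ w * x."""
    return sum(2 * (w * x - y) * x for x, y in data) / len(data)


def fit_rereading(path, iterations, lr, counter):
    """Gradient descent that rereads the data every iteration
    (the access pattern assumed here for chained MapReduce jobs)."""
    w = 0.0
    for _ in range(iterations):
        data = read_dataset(path, counter)  # one disk pass per iteration
        w -= lr * gradient(w, data)
    return w


def fit_cached(path, iterations, lr, counter):
    """Same gradient descent, but the data is loaded once and kept in memory."""
    data = read_dataset(path, counter)  # single disk pass
    w = 0.0
    for _ in range(iterations):
        w -= lr * gradient(w, data)
    return w


def demo(iterations=20):
    """Run both variants on a toy dataset and report weights and disk passes."""
    rows = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # toy data with y = 2x
    fd, path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)
    try:
        write_dataset(path, rows)
        rereading, cached = {"disk_passes": 0}, {"disk_passes": 0}
        w1 = fit_rereading(path, iterations, 0.05, rereading)
        w2 = fit_cached(path, iterations, 0.05, cached)
    finally:
        os.remove(path)
    return w1, w2, rereading["disk_passes"], cached["disk_passes"]


if __name__ == "__main__":
    print(demo())
```

Both variants arrive at the same weight; the only difference is how many full passes over the file they make, and that count grows with the number of iterations in the rereading variant. On a real cluster each of those passes would also carry job startup and network cost, which this toy version does not model.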

Genie · posted 2012-10-24 21:45:42

chenwq posted on 2012-10-24 19:34
"A framework is like a language or a specification. You can then have different implementations"

...

The point he stressed again and again: "Focus on ease of use"

chenwq · posted 2012-10-24 19:34:59

"A framework is like a language or a specification. You can then have different implementations"

"let problems drive the tools"

cwc · posted 2012-10-24 15:51:14

I hear Andrew Ng is going to give a talk at Baidu next week. I'm green with envy.
