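Fun, isn't it?
#13 @luikore I just ran it on Ruby 2.0 on Heroku... and the app above stopped running~ http://blog.heroku.com/archives/2012/11/5/ruby-2-preview-on-heroku/
#4 @quakewang No. Oh, it seems to be the same as the highlighting...
#5 @bhuztez They didn't say it is faster than a Trie; as I recall they said regexps are more flexible and not slow either. Also, even if a trie implemented in Perl and a regexp have the same algorithmic complexity, the regexp engine is implemented in C, so the speed will probably still differ. That's why I want to test it with real data :-)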
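#2 @luikore I saw it in this book, http://book.douban.com/subject/6758780/ , which describes how they handle keyword links: given a set of keywords, autolink the full text, roughly blog.body.gsub(/(key1|key2...)/, "<a href='...'>\\1</a>")
The scenario should be: short keywords, but a lot of them, and long articles.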
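A minimal sketch of the keyword autolink idea described above, using Regexp.union and gsub. The LINKS table, its URLs, and the autolink method name are made up for illustration, and the body is assumed to be plain text that contains no HTML yet.

```ruby
# Hypothetical keyword => URL table; a real one would hold many short keywords.
LINKS = {
  "Ruby"          => "/tags/ruby",
  "Rails"         => "/tags/rails",
  "Ruby on Rails" => "/tags/ruby-on-rails"
}.freeze

# Regexp.union escapes each string and joins them with "|".
# Alternatives are tried left to right, so put longer keywords first;
# otherwise "Ruby" would win over "Ruby on Rails" at the same position.
KEYWORD_RE = Regexp.union(LINKS.keys.sort_by { |k| -k.length })

# Wrap every keyword occurrence in a link. gsub does not rescan the
# replacement text, so keywords inside generated tags are not linked again.
def autolink(body)
  body.gsub(KEYWORD_RE) { |kw| "<a href='#{LINKS[kw]}'>#{kw}</a>" }
end

puts autolink("I like Ruby on Rails and plain Ruby.")
```

With thousands of keywords the alternation gets large, which is presumably why the trie-based approaches come up in this thread; this sketch only shows the starting point.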
Reportedly they went from Regexp union -> a Trie using the Aho-Corasick algorithm -> Perl's Regexp::List.
For find and uniq over a large number of URLs they use CRC32.
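One way to read the CRC32 remark is to keep a 32-bit checksum per URL instead of the full string. This is only a sketch under that assumption: the uniq_by_crc32 name and the example URLs are hypothetical, and two different URLs can share a checksum, so a real implementation would have to decide how to handle collisions.

```ruby
require "zlib"
require "set"

# Deduplicate URLs by remembering one CRC32 checksum per URL.
# Assumption: treating a rare checksum collision as a duplicate is acceptable.
def uniq_by_crc32(urls)
  seen = Set.new
  # Set#add? returns nil when the checksum is already present,
  # so select keeps only the first URL for each checksum.
  urls.select { |url| seen.add?(Zlib.crc32(url)) }
end

urls = %w[http://a.example/1 http://a.example/2 http://a.example/1]
p uniq_by_crc32(urls)
```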
How nice.
Doesn't anyone read the docs....... http://ruby-doc.org/core-1.9.3/Dir.html#method-c-glob
glob( pattern, [flags] ) {| filename | block } → nil
Returns the filenames found by expanding pattern which is an Array of the patterns or the pattern String, either as an array or as parameters to the block. Note that this pattern is not a regexp (it's closer to a shell glob). See File::fnmatch for the meaning of the flags parameter. Note that case sensitivity depends on your system (so File::FNM_CASEFOLD is ignored), as does the order in which the results are returned.
*: Matches any file. Can be restricted by other values in the glob. * will match all files; c* will match all files beginning with c; *c will match all files ending with c; and *c* will match all files that have c in them (including at the beginning or end). Equivalent to / .* /x in regexp. Note, this will not match Unix-like hidden files (dotfiles). In order to include those in the match results, you must use something like "{*,.*}".
**: Matches directories recursively.
?: Matches any one character. Equivalent to /.{1}/ in regexp.
[set]: Matches any one character in set. Behaves exactly like character sets in Regexp, including set negation ([^a-z]).
{p,q}: Matches either literal p or literal q. Matching literals may be more than one character in length. More than two literals may be specified. Equivalent to pattern alternation in regexp.
\: Escapes the next metacharacter. Note that this means you cannot use backslash in windows as part of a glob, i.e. Dir["c:\\foo*"] will not work, use Dir["c:/foo*"] instead.
Dir["config.?"] #=> ["config.h"]
Dir.glob("config.?") #=> ["config.h"]
Dir.glob("*.[a-z][a-z]") #=> ["main.rb"]
Dir.glob("*.[^r]*") #=> ["config.h"]
Dir.glob("*.{rb,h}") #=> ["main.rb", "config.h"]
Dir.glob("*") #=> ["config.h", "main.rb"]
Dir.glob("*", File::FNM_DOTMATCH) #=> [".", "..", "config.h", "main.rb"]
rbfiles = File.join("**", "*.rb")
Dir.glob(rbfiles) #=> ["main.rb",
# "lib/song.rb",
# "lib/song/karaoke.rb"]
libdirs = File.join("**", "lib")
Dir.glob(libdirs) #=> ["lib"]
librbfiles = File.join("**", "lib", "**", "*.rb")
Dir.glob(librbfiles) #=> ["lib/song.rb",
# "lib/song/karaoke.rb"]
librbfiles = File.join("**", "lib", "*.rb")
Dir.glob(librbfiles) #=> ["lib/song.rb"]
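The documentation examples above assume a particular directory layout. The sketch below recreates a similar layout in a temporary directory so the patterns can be tried directly. The glob_demo method name and the .hidden file are additions for illustration, and the results are sorted because the return order depends on the system.

```ruby
require "tmpdir"
require "fileutils"

# Build a throwaway tree and run a few glob patterns against it.
def glob_demo
  Dir.mktmpdir do |root|
    Dir.chdir(root) do
      FileUtils.mkdir_p("lib/song")
      FileUtils.touch(%w[main.rb config.h .hidden lib/song.rb lib/song/karaoke.rb])

      {
        by_ext:        Dir.glob("*.{rb,h}").sort,               # {p,q} alternation
        recursive:     Dir.glob("**/*.rb").sort,                # ** descends into directories
        visible:       Dir.glob("*").sort,                      # * skips dotfiles
        with_dotfiles: Dir.glob("*", File::FNM_DOTMATCH).sort   # flag includes dotfiles
      }
    end
  end
end

glob_demo.each { |name, files| puts "#{name}: #{files.inspect}" }
```

Whether "." and ".." show up in the FNM_DOTMATCH result differs between Ruby versions, so code that depends on that list should filter them out explicitly.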
Why is everything video these days? A text version would be so much nicer, ahh
android --just a seo test...
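There are three kinds of lies: lies, damned lies, and statistics.
OK, then I just made that up in my head...
Just passing by. The OP's avatar looks like he's sitting on a toilet...........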
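#19 @quakewang The referer doesn't seem to be a problem; you can trigger the GA call manually the way pjax does: https://github.com/defunkt/jquery-pjax/blob/master/jquery.pjax.js#L257
With turbolinks you can add it in the page:change event. I just checked, and GA also provides an interface for overriding the referer: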
_gaq.push(['_setReferrerOverride', "referer_url"]);
_gaq.push(['_trackPageview']);
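#19 @quakewang That really is the case. Do all ajax requests have this incorrect-referer problem?
#14 @xds2000 turbolinks mainly cuts the time spent re-executing JS, so the better the machine and the more efficient the browser, the smaller the gain from turbolinks. The turbolinks readme links the sentence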
In any case, the benefit ranges from twice as fast on apps with little JS/CSS, to three times as fast in apps with lots of it.
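to the turbolinks test page, but it doesn't match the turbolinks_test results... How was this "twice as fast" calculated?
Also, the test doesn't say whether a web server was used... If not, that's unreasonable too, because in a real environment static files carry an Expires header and no conditional GET is made, whereas in this test environment every request has to make a conditional GET, right?
I didn't follow.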
...
The phash algorithm can also compute similarity for non-text content: http://rubylution.herokuapp.com/topics/15
Kudos for doing it the OpenSource way
!