Hi,
Due to a recent change at YouTube, the video_id in the first request (to the get_video script) and the video id in the resulting redirect are different, which makes videocache download both. I suggest changing the recommended regex in squid.conf to reject Google's cache servers.
13 Answers
Imriz,
Can you please cite an example? I didn't really get this.
Thank You!
Consider the following get:
--13:29:54-- http://www.youtube.com/get_video?video_id=eMvDax3AuP4&t=vjVQa1PpcFN_VmUBILxH3sDqc-eMoDh9gTA8XTOgTco=&el=detailpage&ps=&fmt=34
Connecting to www.youtube.com|208.65.153.238|:80... connected.
HTTP request sent, awaiting response... 303 See Other
Location: http://v6.cache.googlevideo.com/videoplayback?id=78cbc36b1dc0b8fe&itag=34&ip=212.199.24.93&region=0&signature=47438D8A39678E6C830127C5ADFB25E605887C24.7DD4CA62178EE9504414B83E8DA430FA80E324DF&sver=2&expire=1235323795&key=yt1&ipbits=0 [following]
--13:29:55-- http://v6.cache.googlevideo.com/videoplayback?id=78cbc36b1dc0b8fe&itag=34&ip=212.199.24.93&region=0&signature=47438D8A39678E6C830127C5ADFB25E605887C24.7DD4CA62178EE9504414B83E8DA430FA80E324DF&sver=2&expire=1235323795&key=yt1&ipbits=0
Resolving v6.cache.googlevideo.com... 74.125.99.223
Connecting to v6.cache.googlevideo.com|74.125.99.223|:80... connected.
Notice that the video ids in the get_video request and in the videoplayback request are different. videocache will try to download them as separate items.
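To make the mismatch concrete, here is a minimal Python sketch that pulls the identifying parameter out of both URLs. The helper name video_key is made up for illustration, the URLs are shortened copies of the ones in the log above, and this is not necessarily how videocache itself extracts ids.

```python
from urllib.parse import urlparse, parse_qs

def video_key(url):
    """Return the query parameter that identifies the video, or None.

    Assumption: get_video uses 'video_id' and videoplayback uses 'id',
    as seen in the wget log above.
    """
    query = parse_qs(urlparse(url).query)
    for name in ("video_id", "id"):
        if name in query:
            return query[name][0]
    return None

# Shortened versions of the two URLs from the log.
first = "http://www.youtube.com/get_video?video_id=eMvDax3AuP4&fmt=34"
redirect = "http://v6.cache.googlevideo.com/videoplayback?id=78cbc36b1dc0b8fe&itag=34"

# The two keys differ, so a cache keyed on them sees two separate videos.
print(video_key(first), video_key(redirect))
```

Any cache that keys on these values will treat the original request and its redirect as unrelated items.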
Imriz,
OK. Then we should cache videos only from youtube and deny videos from the googlevideo servers. Can you try that once in your setup? Just deny them in squid.conf and they'll never reach videocache for download.
But I am confused about the choice here: should we deny the youtube ones or the googlevideo ones? What's your insight on this?
Thank You!
Hi,
With a transparent proxy in mind, I would suggest denying the googlevideo ones.
Kulbir, Imriz, ...
I wrote a page about ... after a lot of testing ...
http://www.bellera.cat/josep/videocache/squid_videocache_youtube.html
Regards,
Josep Pujadas
Great stuff.
After a lot of testing I managed to force youtube caching to work well.
This is my config:
# --BEGIN-- videocache config for squid
#url_rewrite_program /usr/bin/python /usr/local/videocache/videocache.py
url_rewrite_program /usr/local/squid/bin/zapchain "/usr/local/squid/bin/gg_rewrite" "/usr/bin/python /usr/local/videocache/videocache.py"
url_rewrite_children 5
acl videocache_allow_url url_regex -i www\.youtube\.com\/get_video\?
#acl videocache_allow_url url_regex -i \.googlevideo\.com\/videoplayback \.googlevideo\.com\/get_video\?
#acl videocache_allow_url url_regex -i \.google\.com\/videoplayback \.google\.com\/get_video\?
#acl videocache_allow_url url_regex -i \.google\.[a-z][a-z]\/videoplayback \.google\.[a-z][a-z]\/get_video\?
#acl videocache_allow_url url_regex -i (25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\
#acl videocache_allow_url url_regex -i (25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\
#acl videocache_allow_url url_regex -i proxy[a-z0-9\-][a-z0-9][a-z0-9][a-z0-9]?\.dailymotion\.com\/
#acl videocache_allow_url url_regex -i vid\.akm\.dailymotion\.com\/
#acl videocache_allow_url url_regex -i [a-z0-9][0-9a-z][0-9a-z]?[0-9a-z]?[0-9a-z]?\.xtube\.com\/(.*)flv
#acl videocache_allow_url url_regex -i bitcast\.vimeo\.com\/vimeo\/videos\/
acl videocache_allow_url url_regex -i va\.wrzuta\.pl\/wa[0-9][0-9][0-9][0-9]?
#acl videocache_allow_url url_regex -i \.files\.youporn\.com\/(.*)\/flv\/
#acl videocache_allow_url url_regex -i \.msn\.com\.edgesuite\.net\/(.*)\.flv
#acl videocache_allow_url url_regex -i media[a-z0-9]?[a-z0-9]?[a-z0-9]?\.tube8\.com\/ mobile[a-z0-9]?[a-z0-9]?[a-z0-9]?\.tube8\.com\/
#acl videocache_allow_url url_regex -i \.mais\.uol\.com\.br\/(.*)\.flv
#acl videocache_allow_url url_regex -i \.video[a-z0-9]?[a-z0-9]?\.blip\.tv\/(.*)\.(flv|avi|mov|mp3|m4v|mp4|wmv|rm|ram)
#acl videocache_allow_url url_regex -i video\.break\.com\/(.*)\.(flv|mp4)
#acl videocache_allow_dom dstdomain v.mccont.com dl.redtube.com .cdn.dailymotion.com
acl videocache_allow_dom dstdomain dl.redtube.com
#acl videocache_deny_url url_regex -i http:\/\/[a-z][a-z]\.youtube\.com http:\/\/www\.youtube\.com
acl videocache_deny_url url_regex -i http:\/\/[a-z][a-z]\.youtube\.com
url_rewrite_access deny videocache_deny_url
url_rewrite_access allow videocache_allow_url
url_rewrite_access allow videocache_allow_dom
url_rewrite_access allow GG_banner
redirector_bypass on
# --END-- videocache config for squid
As you can see, I commented out all the googlevideo ACLs and the ones I don't use.
I changed the youtube ACL to match www.youtube.com.
After that, YouTube URLs are requested only once, but I had to change hit_threshold to 3 because some YouTube URLs reply with "we're sorry this video is no longer available", and users usually click refresh to fix this, which causes another request and unnecessary caching. This happened before, so it's not a videocache issue.
Big thanks for this stuff. I searched for something like this for a very long time.
Gandi,
Thanks for the compliments and sharing your experience with videocache.
Keep caching :D
I found another bug.
When a user tries to seek through a youtube or redtube video, it makes another request.
I had to modify the config to something like this:
[...]
acl videocache_allow_url url_regex -i www\.youtube\.com\/get_video\?
acl videocache_allow_url url_regex -i dl\.redtube\.com\/(.*)\.flv\?start=0
acl videocache_deny_url url_regex -i http:\/\/[a-z][a-z]\.youtube\.com www\.youtube\.com\/get_video\?video_id=.{11}&(start|begin)=
[...]
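The effect of these three acl lines can be checked offline with a rough Python sketch. It assumes that Python's re module behaves like squid's url_regex for these particular patterns, that the deny rule is evaluated before the allow rule (as in the url_rewrite_access order of the full config earlier in the thread), and that nothing in the elided parts of the config matters. The function name goes_to_videocache and the sample URLs are hypothetical.

```python
import re

# Patterns from the acl lines above, written with single backslashes
# as Python raw strings.
DENY = [
    r"http:\/\/[a-z][a-z]\.youtube\.com",
    r"www\.youtube\.com\/get_video\?video_id=.{11}&(start|begin)=",
]
ALLOW = [
    r"www\.youtube\.com\/get_video\?",
    r"dl\.redtube\.com\/(.*)\.flv\?start=0",
]

def goes_to_videocache(url):
    """True if the URL would be handed to the rewriter under these rules."""
    # -i in squid.conf means case-insensitive; url_regex matches anywhere in the URL.
    if any(re.search(p, url, re.IGNORECASE) for p in DENY):
        return False
    return any(re.search(p, url, re.IGNORECASE) for p in ALLOW)

# Hypothetical URLs: a plain request and a seek request for the same video.
plain = "http://www.youtube.com/get_video?video_id=eMvDax3AuP4&t=abc&fmt=34"
seek = "http://www.youtube.com/get_video?video_id=eMvDax3AuP4&start=120&fmt=34"
print(goes_to_videocache(plain), goes_to_videocache(seek))
```

Note that the youtube deny pattern only catches seeks where start= or begin= comes directly after the 11-character video_id; a different parameter order would slip through.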
Gandi,
That's not the case, actually. When a user seeks, the video is requested again by the client. If the video is in the queue (was requested previously), re-requesting will just increase its priority and will not force another download. So it can be ignored.
Thank You!
Hi again!
I realize that it will not force another download, but it requests the video another time.
Let's see an example :)
hit_threshold = 2
The user starts to watch a youtube video - this is the first request.
The user seeks through the video - this is the second request, and videocache starts the download.
It's not desirable, I think.
One user can start caching a video just by seeking through the file.
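The scenario can be sketched as a simple hit counter. This is only a model of the behaviour described here (a download starts when the request count for a video reaches hit_threshold), not videocache's actual code; the function name downloads_started is made up.

```python
from collections import Counter

def downloads_started(requests, hit_threshold):
    """Return the videos whose request count reached hit_threshold.

    'requests' is a list of video ids in arrival order. A seek counts
    as one more request for the same id (the assumption made here).
    """
    hits = Counter()
    started = []
    for video in requests:
        hits[video] += 1
        if hits[video] == hit_threshold:
            started.append(video)
    return started

# One user: initial play, then a single seek on the same video.
one_user = ["eMvDax3AuP4", "eMvDax3AuP4"]
print(downloads_started(one_user, hit_threshold=2))
print(downloads_started(one_user, hit_threshold=3))
```

With a threshold of 2, the play plus one seek is enough to start the download; with 3 it is not.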
Sorry for off topic.
Gandi,
Your argument is valid, but the regex you proposed will create problems seeking in videos which have been cached by videocache. Moreover, when one sets some value for hit_threshold, I think it may be fine if the effective count is off by one (+1 or -1).
Thank You!
Hi again!
You are right, it causes problems seeking in cached videos :( but they are served from cache and loaded into the browser very fast.
Is there a possibility to add a feature that ignores repeated requests from a single IP within a particular time? I think it should fix it.
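Something like the following is what I have in mind, as a rough Python sketch. It is not an existing videocache feature; the class name RecentRequests, the window length and the idea of passing in the timestamp are all assumptions for illustration.

```python
class RecentRequests:
    """Decide whether a request should count towards hit_threshold.

    A request from the same client IP for the same video is ignored if
    the previous one arrived less than window_seconds ago.
    """

    def __init__(self, window_seconds):
        self.window = window_seconds
        self.last_seen = {}  # (client_ip, video_id) -> time of last request

    def should_count(self, client_ip, video_id, now):
        key = (client_ip, video_id)
        previous = self.last_seen.get(key)
        # Remember every request, so continuous seeking keeps extending the window.
        self.last_seen[key] = now
        return previous is None or now - previous >= self.window

recent = RecentRequests(window_seconds=300)
print(recent.should_count("10.0.0.5", "eMvDax3AuP4", now=0))   # initial play
print(recent.should_count("10.0.0.5", "eMvDax3AuP4", now=60))  # seek a minute later
```

A request for the same video from a different IP would still count, so hit_threshold keeps its meaning of "requested by several users".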
Big thanks again!