Hello!
At our school we use squid with squidGuard + videocache.
With the caches (squid and videocache) and squidGuard (no add-ons) we can optimize our Internet connection. squidGuard also filters content that is unsuitable for children.
But in some circumstances, especially on large video sites like youtube.com, there is adult content that is difficult to detect.
For example, I found this video in our cache:
http://www.youtube.com/watch?v=-5SglF2dazA
It is not marked as adult content on youtube.com.
What's more, the “Related videos” links on the youtube.com page lead to a lot of similar videos. squidGuard has no way to detect this problem.
I was wondering whether it would be possible to store the keywords for cached videos. It would be a very interesting tool for browsing the cache by keyword.
Perhaps a simple text file or XML file in each site's cache folder would be enough.
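To make the idea concrete, here is a rough Python sketch of such a text file. Everything in it is my own assumption, not something videocache does today: the file layout (one line per video, the id, a tab, then comma-separated keywords) and the function names `add_video`, `load_index` and `search` are all hypothetical.

```python
import os


def add_video(index_path, video_id, keywords):
    """Append one line to the index: video id, a tab, comma-separated keywords."""
    # Normalize keywords so that later lookups are case-insensitive.
    cleaned = sorted({k.strip().lower() for k in keywords if k.strip()})
    with open(index_path, "a", encoding="utf-8") as f:
        f.write(video_id + "\t" + ",".join(cleaned) + "\n")


def load_index(index_path):
    """Read the index file into a dict: video id -> set of keywords."""
    index = {}
    if not os.path.exists(index_path):
        return index
    with open(index_path, encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\n")
            if not line:
                continue
            video_id, _, kw = line.partition("\t")
            index[video_id] = set(kw.split(",")) if kw else set()
    return index


def search(index, keyword):
    """Return the ids of all cached videos tagged with the given keyword."""
    wanted = keyword.strip().lower()
    return sorted(vid for vid, kws in index.items() if wanted in kws)
```

With a file like this, browsing the cache by keyword is just `search(load_index(path), "some keyword")`. An XML file would work the same way, only with a parser instead of `partition`.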
The youtube.com API makes it possible to obtain all the information about a video: http://code.google.com/intl/en/apis/youtube/1.0/developers_guide_python....
The ideal (for us) would be to integrate this video database with squidGuard filtering.
Doing this in real time is perhaps impossible. We redirect first to squidGuard and then to videocache, using zapchain.
The easiest way would (perhaps) be to process the database with a shell script and add the undesired videos to the squidGuard filtering.
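As a sketch of that batch step (in Python rather than shell, only because it is easier for me to show), the script could turn the keyword database into regular expressions for a squidGuard expression list. I am assuming here that the database is available as a mapping from video id to a set of keywords, that video ids only contain letters, digits, `_` and `-`, and that one regex per line is what squidGuard expects; the function names are hypothetical and the exact list format should be checked against the squidGuard documentation.

```python
import re

# Assumption: video ids use only these characters, so they need no regex escaping.
VALID_ID = re.compile(r"^[A-Za-z0-9_-]+$")


def blocked_ids(index, forbidden_keywords):
    """Return ids of videos that carry at least one forbidden keyword."""
    bad = {w.strip().lower() for w in forbidden_keywords}
    return sorted(vid for vid, kws in index.items() if kws & bad)


def expression_lines(video_ids):
    """Build one regular expression per video, matching its watch URL."""
    lines = []
    for vid in video_ids:
        if not VALID_ID.match(vid):
            continue  # skip anything that does not look like a plain video id
        lines.append(r"youtube\.com/watch\?.*v=" + vid)
    return lines


def write_expression_file(path, lines):
    """Write the expressions to a file, one per line."""
    with open(path, "w", encoding="utf-8") as f:
        for line in lines:
            f.write(line + "\n")
```

Run from cron, something like this would keep the filter list in step with the cache without touching the real-time redirect chain.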
The most complete solution would be to add a list of forbidden keywords to videocache. That way videocache could check the keywords for each video and decide whether it should be cached/served or redirected to an “access denied” page (as squidGuard does).
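The check itself could be very small. The following Python sketch only illustrates the decision; the function name `decide`, its return values and the way the keywords arrive are my assumptions, not part of videocache.

```python
def decide(video_keywords, forbidden_keywords, denied_url):
    """Decide what to do with a requested video.

    Returns ("serve", None) if no keyword is forbidden,
    otherwise ("redirect", denied_url).
    """
    # Compare case-insensitively and ignore surrounding whitespace.
    video = {k.strip().lower() for k in video_keywords}
    forbidden = {k.strip().lower() for k in forbidden_keywords}
    if video & forbidden:
        return ("redirect", denied_url)
    return ("serve", None)
```

This matches whole keywords only. Matching substrings would catch more videos but would also produce more false positives, so I would start with exact matches.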
Regards,
Josep Pujadas

