G+_Adam EL-Idrissi Posted December 18, 2015
Is there a way to automatically download new files? I want a way to automatically download the newest audio of Know How as well as other TWiT shows. I know there are podcast apps and such that do it, but I'm not sure if there is a way to do it in Linux. Plus I want to do that same style of check-and-auto-download for other sites when they upload new videos or audio.
G+_Jeff Brand Posted December 18, 2015
The short answer is yes... a cron job and a little scripting would do it. Just remember to check infrequently (at least 2 hours apart, more like every 12 hours) to keep the impact on the server to a minimum. rsstail looks like it does much of what you need to get started: https://github.com/flok99/rsstail
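To give the "little scripting" part a concrete shape, here is a minimal sketch in Python. It is not exactly the rsstail route: it reads the feed with the third-party feedparser package instead, and the feed URL, download directory, state file, and script path in the crontab comment are all placeholders to swap for your own.

```python
#!/usr/bin/env python3
"""Download podcast enclosures we have not fetched before.

Example crontab entry (crontab -e) to run this twice a day;
the script path is just an example:
    0 */12 * * * /usr/bin/python3 /home/pi/podfetch.py

Requires the third-party feedparser package (pip install feedparser).
"""
import os
import urllib.request

import feedparser

FEED_URL = "https://example.com/podcast.xml"      # placeholder feed URL
DOWNLOAD_DIR = os.path.expanduser("~/podcasts")   # where files land
SEEN_FILE = os.path.join(DOWNLOAD_DIR, ".seen")   # remembers what we already have


def load_seen():
    """Return the set of enclosure URLs already downloaded."""
    try:
        with open(SEEN_FILE) as f:
            return {line.strip() for line in f}
    except FileNotFoundError:
        return set()


def main():
    os.makedirs(DOWNLOAD_DIR, exist_ok=True)
    seen = load_seen()

    for entry in feedparser.parse(FEED_URL).entries:
        for enclosure in entry.get("enclosures", []):
            url = enclosure.get("href")
            if not url or url in seen:
                continue
            filename = os.path.join(DOWNLOAD_DIR, url.rsplit("/", 1)[-1])
            print("fetching", url)
            urllib.request.urlretrieve(url, filename)
            with open(SEEN_FILE, "a") as f:
                f.write(url + "\n")


if __name__ == "__main__":
    main()
```

The .seen file is just a crude "already downloaded" list so the script can be run from cron over and over without re-fetching old episodes.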
G+_610GARAGE Posted December 18, 2015
If you don't mind a GUI, gPodder is cross-platform. I use it under Windows to get Know How and it works great. gpodder.org/downloads
G+_Adam EL-Idrissi Posted December 18, 2015
Jeff Brand, I'll look into that. 610bob, I don't mind a GUI, but I was thinking about running it on a Pi or in a FreeNAS jail (once I rebuild it). I'm still learning how to use the terminal and programming (although I'm horrible at both).
G+_Adam EL-Idrissi Posted December 18, 2015
From a quick look, rsstail seems to be part of what I'm looking for. I'll have to do a little looking around for the rest. Is there a difference between wget and curl, or are they essentially the same?
G+_Eddie Foy Posted December 18, 2015
RSS feeds. I have Download Station on a Synology NAS grab various podcasts via RSS feeds.
G+_Jeff Brand Posted December 18, 2015
Adam EL-Idrissi, curl and wget are essentially the same for the purpose of retrieving a single URL. For other uses, wget has some HTML parsing features that are useful when trying to archive or crawl web pages, as well as support for FTP.
G+_Adam EL-Idrissi Posted December 19, 2015
I see how RSS would work, but I've mainly been looking for something that watches for updates on websites (like DEF CON's media server) and then, when it sees a new file, downloads it automatically. From what I've seen, RSS just lets you know. I guess my main question is whether there's a way to program or script this to run on an RPi server. I'm not really a programmer, so how to get started is above my head.
G+_Jeff Brand Posted December 19, 2015
Website change detection is a complex task. It can be done, but it depends on the site. It helps if the markup is well structured and makes minimal dynamic changes. In the old days, this category was called "screen scraping". Take a look at urlwatch as a starting point: https://thp.io/2008/urlwatch/

The benefit of RSS is that it's a standardized format, designed to make it easy to see when new data is added and where the important content is located. For a plain website you'll likely need some HTML DOM parsing tools, plus some scripting to make it do what you need. It'll follow a train of thought like this: "In the content area labeled 'main_content', look for all links that start with 'news/' and then download them."

The Pi could handle this task, depending on the volume of sites being checked. You could check a lot of them before reaching its limit.
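To make that train of thought concrete, here is a rough sketch in Python using only the standard library. To be clear, this isn't urlwatch and it isn't written against any real site: the page URL, the 'main_content' id, the 'news/' prefix, and the download directory are all hypothetical placeholders, and the parser assumes reasonably well-formed markup (matching open and close tags), which is the "well structured" caveat above.

```python
#!/usr/bin/env python3
"""Sketch of the 'screen scraping' approach: fetch a page, collect the
links inside a known content area, and download the ones we don't have.
All names below (URL, element id, link prefix, paths) are made up."""
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen, urlretrieve
import os

PAGE_URL = "https://example.com/media/"     # hypothetical page to watch
CONTENT_ID = "main_content"                 # id of the element holding the links
LINK_PREFIX = "news/"                       # only follow links starting with this
DOWNLOAD_DIR = os.path.expanduser("~/downloads/watched")


class LinkCollector(HTMLParser):
    """Collect href values of <a> tags inside the element with id CONTENT_ID."""

    def __init__(self):
        super().__init__()
        self.depth_inside = 0   # >0 while we are inside the content element
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if self.depth_inside:
            self.depth_inside += 1
            if tag == "a" and attrs.get("href", "").startswith(LINK_PREFIX):
                self.links.append(attrs["href"])
        elif attrs.get("id") == CONTENT_ID:
            self.depth_inside = 1

    def handle_endtag(self, tag):
        if self.depth_inside:
            self.depth_inside -= 1


def main():
    os.makedirs(DOWNLOAD_DIR, exist_ok=True)
    html = urlopen(PAGE_URL).read().decode("utf-8", errors="replace")

    parser = LinkCollector()
    parser.feed(html)

    for link in parser.links:
        url = urljoin(PAGE_URL, link)
        target = os.path.join(DOWNLOAD_DIR, os.path.basename(link))
        if os.path.exists(target):   # crude "have we seen it" check
            continue
        print("downloading", url)
        urlretrieve(url, target)


if __name__ == "__main__":
    main()
```

A real site will need its own adjustments (and a politeness delay between checks), and a dedicated parsing library makes the DOM-walking part less fiddly, but the shape of the job stays the same: fetch the page, find the links you care about, skip the ones you already have, download the rest.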