Suggestion: Download less data daily

12noon


#1

Post by 12noon » Mon Mar 13, 2017 1:59 pm

Is it necessary to download all 14 days (or whatever) every day? Would there be a benefit to downloading less data every day? EPG123 could download only the next two or three days and also the "new" days (like the last two days of the specified span of listings data, so days 13 through 15, for example). This would reduce the amount of data downloaded and also greatly reduce the amount of time it takes EPG123 and WMC to process and import that data.

Unfortunately, I suspect people would want this to be optional, which would mean additional configuration options or command-line switches, and that might make it undesirable to implement. ;) For example, once a week, I might still want to download the entire schedule just to be sure I have the latest everything.
USA 60005
WOW Chicago Suburbs - Digital
USA-IL58819-X

garyan2


#2

Post by garyan2 » Mon Mar 13, 2017 2:55 pm

EPG123 already reduces the amount of data downloaded as much as possible. The cache folder, with its thousands of files, holds almost everything needed to rebuild the current .mxf file. At a minimum, EPG123 will download the schedules for the "new" days and any new programs that do not already exist in the cache. If there have been any updates to programs or schedules for the "old" days, EPG123 will download those changes as well to create the updated .mxf file.

The one thing that is not cached locally, and must be downloaded from SD on every update, is the station schedule MD5 values. Imagine 21 days of 900 channels: it is not an insignificant message/response that EPG123 is trying to complete. The problem with server load is that this is practically the first transaction between EPG123 and SD, and when hundreds of people do it at the same time, the servers get hit pretty hard.

Everything else is either importing existing data from the cache or downloading new data from the servers.
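As a rough illustration of the cache-first behavior described above, here is a minimal Python sketch. It is not EPG123's actual code: the one-JSON-file-per-program layout, the `load_programs` name, and the `fake_server` stand-in are all assumptions made for the example.

```python
import json
import tempfile
from pathlib import Path


def load_programs(program_ids, cache_dir, fetch_from_server):
    """Return (programs, missing): cached data is read locally, misses are fetched.

    Hypothetical layout: one "<programID>.json" file per program in cache_dir.
    """
    cache_dir = Path(cache_dir)
    programs = {}
    missing = []
    for pid in program_ids:
        cached = cache_dir / f"{pid}.json"
        if cached.exists():
            # Already in the cache: no download needed.
            programs[pid] = json.loads(cached.read_text())
        else:
            missing.append(pid)

    if missing:
        # One batched request for everything not yet cached.
        for pid, data in fetch_from_server(missing).items():
            (cache_dir / f"{pid}.json").write_text(json.dumps(data))
            programs[pid] = data
    return programs, missing


def fake_server(ids):
    # Stand-in for the real download; returns placeholder metadata.
    return {pid: {"title": f"Program {pid}"} for pid in ids}


with tempfile.TemporaryDirectory() as cache:
    _, first_run = load_programs(["EP01", "EP02"], cache, fake_server)
    _, second_run = load_programs(["EP01", "EP02", "EP03"], cache, fake_server)
    print(first_run)   # everything is fetched on the first run
    print(second_run)  # only the program that was not yet cached
```

On the second run only `EP03` is requested, which is the point: the full output can still be rebuilt every time while the download stays small.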

I will always create an entire .mxf file ... I proved in v0.8.x that it was possible to do incremental loads, but for whole home configurations and database rebuilds, it is necessary to have a complete file. Incremental is not going to be an option.
- Gary
Keeping WMC alive beyond January 2020. https://garyan2.github.io

rkulagow

Location: Schedules Direct


#3

Post by rkulagow » Mon Mar 13, 2017 3:16 pm

From the API side, each stationID has an MD5 for every day. If the client (EPG123) knows that the MD5 for stationID 20454 on 2017-03-20 is "foo", when it connects to the server it should ask for the MD5 for each of the stations that it's interested in. If the MD5 for 20454 on 2017-03-20 is still "foo", then it doesn't need to download the schedule again, because it hasn't changed.

If the MD5 isn't "foo", then the client requests the schedule for that day and then downloads any programIDs that it doesn't already have cached locally.

That's why most client runs only take a few minutes, because the client should only need to download whatever has changed since the last time you've connected. The rest of your processing time is the local generation of the schedule and the import into MCE.
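The MD5 comparison described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the Schedules Direct API or any client's real code: the `days_to_refresh` name and the dictionaries keyed by `(stationID, date)` are assumptions, and the hash strings are placeholders like the "foo" in the text.

```python
def days_to_refresh(cached_md5, server_md5):
    """Return the (station_id, date) keys whose server MD5 differs from the cache.

    Keys missing from the cache count as changed, so new days are included.
    """
    return sorted(
        key for key, md5 in server_md5.items()
        if cached_md5.get(key) != md5
    )


# What the client remembers from its last run (placeholder hash values).
cached = {
    ("20454", "2017-03-20"): "foo",
    ("20454", "2017-03-21"): "bar",
}

# What the server reports now.
server = {
    ("20454", "2017-03-20"): "foo",  # unchanged: skip
    ("20454", "2017-03-21"): "baz",  # changed: download the schedule again
    ("20454", "2017-03-22"): "qux",  # new day: download
}

stale = days_to_refresh(cached, server)
print(stale)
```

Only the two stale station-days would be requested; the schedule for 2017-03-20 is reused from the local copy.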

Space


#4

Post by Space » Tue Mar 14, 2017 12:43 am

To minimize resources when downloading new guide data, the old ReplayTV DVRs (which used a modem to download the data in the early days) downloaded only the program data that had changed for each day. They also downloaded data only for certain days, not all of them.

The pattern was something like: day 1 (current day), day 2, day 4, day 8, and day 12. So it would skip any updates (even if they were available) for days 3, 5, 6, 7, 9, 10, and 11 (the guide only had 12 days of data max).

I am not suggesting to do this here (I am not sure it would provide any significant benefit to anyone), just something I remembered and thought was interesting...

rkulagow


#5

Post by rkulagow » Tue Mar 14, 2017 1:19 am

If every single MCE user ends up switching to EPG123 / Schedules Direct because they don't want to deal with Rovi anymore, then we can start thinking about these sorts of issues. But, that would be a good problem for us to have!

But honestly, the entire design of the API was around "minimize the transfers, but have as complete information as you can." Please continue to request 21+ days' worth of schedules if your system can handle it. We had a hiccup a few days ago that showed us we needed to scale our servers, and since then I haven't seen any more threads here about issues, so we try to adjust our capacity to match the requests coming in.
