Problem with xml file after upgrade to 1.6.1.3
Forum rules
★ Download the latest EPG123 here: https://garyan2.github.io/ <> Setup guide here: https://garyan2.github.io/install.html ★
★ Download the latest EPG123 here: https://garyan2.github.io/ <> Setup guide here: https://garyan2.github.io/install.html ★
-
- Posts: 47
- Joined: Mon Jul 03, 2017 8:33 pm
- Location:
- HTPC Specs:
Problem with xml file after upgrade to 1.6.1.3
Hi,
I have written a utility that loads all the TV shows from the epg123.xml (now epg123.xmltv) file into two SQL SERVER tables. One for channels and the other for shows. I then export that data into a text file which can be imported into Excel. I use this to peruse upcoming premiere shows and movies, among other things.
After the upgrade, the SQL Server script which imports the XML into the database stopped working. Here's why:
The old xml file had this format:
<channel id="EPG123.11331.schedulesdirect.org">
<display-name>WCBS</display-name>
<display-name>WCBS</display-name>
<display-name>2 WCBS</display-name>
<display-name>2</display-name>
<display-name>CBS</display-name>
<icon src="https://schedulesdirect-api20141201.s3. ... w_270h.png" />
</channel>
The new xml fil has this format:
<channel id="EPG123.11331.schedulesdirect.org">
<display-name>WCBS</display-name>
<display-name>2 WCBS</display-name>
<display-name>2</display-name>
<display-name>CBS</display-name>
<icon src="https://schedulesdirect-api20141201.s3. ... w_270h.png" width="360" height="270" />
</channel>
Note that the second <dislplay-name> tag from the old file is missing from the new file. Presumably, because they are identical. When they are different in the old file, the second tag is present in the new file.
My code relies on the fixed locations of these tags. Here's how it's being inserted into SQL Server:
ELECT *
into channels
FROM OPENXML (@hdoc, '/tv/channel', 2)
WITH (
id varchar(50) '@id',
chan1 varchar(50) 'display-name[1]',
chan2 varchar(50) 'display-name[2]',
chan3 varchar(50) 'display-name[3]',
chan4 float 'display-name[4]',
chan5 varchar(50) 'display-name[5]',
icon varchar(100))
How do I detect when one (or maybe more) <display-name> tags are not provided for a specific TV channel?
I know this isn't a normal request, but I really like using this utility.
Regards,
Dan
I have written a utility that loads all the TV shows from the epg123.xml (now epg123.xmltv) file into two SQL SERVER tables. One for channels and the other for shows. I then export that data into a text file which can be imported into Excel. I use this to peruse upcoming premiere shows and movies, among other things.
After the upgrade, the SQL Server script which imports the XML into the database stopped working. Here's why:
The old xml file had this format:
<channel id="EPG123.11331.schedulesdirect.org">
<display-name>WCBS</display-name>
<display-name>WCBS</display-name>
<display-name>2 WCBS</display-name>
<display-name>2</display-name>
<display-name>CBS</display-name>
<icon src="https://schedulesdirect-api20141201.s3. ... w_270h.png" />
</channel>
The new xml fil has this format:
<channel id="EPG123.11331.schedulesdirect.org">
<display-name>WCBS</display-name>
<display-name>2 WCBS</display-name>
<display-name>2</display-name>
<display-name>CBS</display-name>
<icon src="https://schedulesdirect-api20141201.s3. ... w_270h.png" width="360" height="270" />
</channel>
Note that the second <dislplay-name> tag from the old file is missing from the new file. Presumably, because they are identical. When they are different in the old file, the second tag is present in the new file.
My code relies on the fixed locations of these tags. Here's how it's being inserted into SQL Server:
ELECT *
into channels
FROM OPENXML (@hdoc, '/tv/channel', 2)
WITH (
id varchar(50) '@id',
chan1 varchar(50) 'display-name[1]',
chan2 varchar(50) 'display-name[2]',
chan3 varchar(50) 'display-name[3]',
chan4 float 'display-name[4]',
chan5 varchar(50) 'display-name[5]',
icon varchar(100))
How do I detect when one (or maybe more) <display-name> tags are not provided for a specific TV channel?
I know this isn't a normal request, but I really like using this utility.
Regards,
Dan
- garyan2
- Posts: 7474
- Joined: Fri Nov 27, 2015 7:23 pm
- Location:
- HTPC Specs:
I don't speak SQL, but can you detect which display-name contains digits only and count up 2 from there?
- Gary
Keeping WMC alive beyond January 2020. https://garyan2.github.io
Keeping WMC alive beyond January 2020. https://garyan2.github.io
-
- Posts: 145
- Joined: Thu Sep 10, 2020 8:03 am
- Location:
- HTPC Specs:
Count the number of <display-name> tags.
If there are 5 then continue as is.
If there are 4 then insert the second one with the contents of the first.
If there are 5 then continue as is.
If there are 4 then insert the second one with the contents of the first.
- garyan2
- Posts: 7474
- Joined: Fri Nov 27, 2015 7:23 pm
- Location:
- HTPC Specs:
The only problem with that is this last change to xmltv file to be compliant with the DTD, there could be multiple channel numbers in there so you could have 5, 7, 9, 11 or 4, 6, 8, 10. The last line is the affiliate, but if there is no affiliate for the channel then it is reduced by 1. Can't do odd/even counts.
- Gary
Keeping WMC alive beyond January 2020. https://garyan2.github.io
Keeping WMC alive beyond January 2020. https://garyan2.github.io
-
- Posts: 2839
- Joined: Sun Jun 02, 2013 9:44 pm
- Location:
- HTPC Specs:
What is the purpose of all those display-names? Do you really need them all for you purposes? Why not just load the first one in to the DB and ignore all the rest?
I don't know much about the XMLTV format, but I think it is strange to have multiple display-names that may differ in count for each channel. How do you tell which is which when they all have the same "display-name" identifier and you cannot rely on the position to determine which is which?
I don't know much about the XMLTV format, but I think it is strange to have multiple display-names that may differ in count for each channel. How do you tell which is which when they all have the same "display-name" identifier and you cannot rely on the position to determine which is which?
-
- Posts: 47
- Joined: Mon Jul 03, 2017 8:33 pm
- Location:
- HTPC Specs:
I confess to not have a very strong knowledge on SQL Server's XML support. I don't know if I can look ahead and see how many tags there are, or what their content is. I am investigating that now. I'll report back with my findings.
-
- Posts: 47
- Joined: Mon Jul 03, 2017 8:33 pm
- Location:
- HTPC Specs:
The whole system is reference only for me. I eventually load the data into Excel so I have a comprehensive list of all the upcoming TV offerings that I can filter and sort to my purposes. While all the varied display-name tags aren't necessary for me, I particularly like the numeric one (the channel number by itself).Space wrote: ↑Sat Feb 13, 2021 5:40 am What is the purpose of all those display-names? Do you really need them all for you purposes? Why not just load the first one in to the DB and ignore all the rest?
I don't know much about the XMLTV format, but I think it is strange to have multiple display-names that may differ in count for each channel. How do you tell which is which when they all have the same "display-name" identifier and you cannot rely on the position to determine which is which?
I don't think your second question is directed to me. I've wondered the same thing about why the same tag name was reused and how epg123 distinguishes between therm, but I'm sure there's a good reason, and it's not my concern.
-
- Posts: 47
- Joined: Mon Jul 03, 2017 8:33 pm
- Location:
- HTPC Specs:
Here is my interim solution. I reverted my epg123 installation to the version with the fixed number of tags. I will continue to see if I can make the necessary checks on the tag counts/values in SQL Server, so I can use the current epg123. If not, I may install the older epg123 on a different PC for the sole purpose of providing input to my utility, and upgrade the PC that records shows on WMC so it has the latest and greatest.
I want to thank Gary and whoever assists him for this remarkable application, as well as his prompt and extremely insightful postings on this forum. Thanks to everyone else for suggestions and ideas.
I want to thank Gary and whoever assists him for this remarkable application, as well as his prompt and extremely insightful postings on this forum. Thanks to everyone else for suggestions and ideas.
-
- Posts: 145
- Joined: Thu Sep 10, 2020 8:03 am
- Location:
- HTPC Specs:
Under what conditions does the multiple channel numbers occur and how are the display-name entries ordered within the channel id tag?garyan2 wrote: ↑Sat Feb 13, 2021 2:49 am The only problem with that is this last change to xmltv file to be compliant with the DTD, there could be multiple channel numbers in there so you could have 5, 7, 9, 11 or 4, 6, 8, 10. The last line is the affiliate, but if there is no affiliate for the channel then it is reduced by 1. Can't do odd/even counts.
A channel here, KGW 8 (8.1) which is also broadcast on 23 (8.10). The sub channels too.
But only the one I have included in the guide is in the xmltv.
- garyan2
- Posts: 7474
- Joined: Fri Nov 27, 2015 7:23 pm
- Location:
- HTPC Specs:
Your 8.1 (KGWDT) and 8.10 (KGWZLD) channels are considered different stations with different station IDs so they won't be in the same channel construct of the xmltv file.
The display-names from epg123 are constructed like the following:
Finally the last entry will be the affiliate broadcaster which is typically on for the OTA channels.
So for an example, my local cableco has ActionMAX HD (Pacific) on 3 channels, and they will all be within the channel "epg123.59950.schedulesdirect.org"
The display-names from epg123 are constructed like the following:
- Callsign
- Name (if different from callsign)
- Channel number + callsign (if configured to include channel numbers)
- Channel number (if configured to include channel numbers)
Finally the last entry will be the affiliate broadcaster which is typically on for the OTA channels.
So for an example, my local cableco has ActionMAX HD (Pacific) on 3 channels, and they will all be within the channel "epg123.59950.schedulesdirect.org"
- ACMXHDP
- ActionMAX HD (Pacific)
- 224 ACMXHDP
- 224
- 1222 ACMXHDP
- 1222
- 1224 ACMXHDP
- 1224
- Gary
Keeping WMC alive beyond January 2020. https://garyan2.github.io
Keeping WMC alive beyond January 2020. https://garyan2.github.io
-
- Posts: 2839
- Joined: Sun Jun 02, 2013 9:44 pm
- Location:
- HTPC Specs:
Again, I don't understand how this can work. How is the program that reads in the XML supposed to know what is what in that list?
The only one you can be sure of is the first one, all the others appear to be optional, so you can have (for example):
callsign ABC
affiliate ABC+
Or you can have:
callsign ABC
name ABC+
They are identical without having any way of knowing the second entry for the first example is the affiliate and the second one for the second example is the name.
The only one you can be sure of is the first one, all the others appear to be optional, so you can have (for example):
callsign ABC
affiliate ABC+
Or you can have:
callsign ABC
name ABC+
They are identical without having any way of knowing the second entry for the first example is the affiliate and the second one for the second example is the name.
-
- Posts: 145
- Joined: Thu Sep 10, 2020 8:03 am
- Location:
- HTPC Specs:
I'm not seeming to get it either. Did a brief xmltv format search and didn't come across anything like that.
- garyan2
- Posts: 7474
- Joined: Fri Nov 27, 2015 7:23 pm
- Location:
- HTPC Specs:
From the xmltv.dtd
The contents and order of the display-name elements, if even included, are up to the designer. There is no standard to dictate what will be first, second, or last. I just included patterns that I have seen other developers use and it will be up to the consuming software to determine which it wants to use, or none at all.
This is also why I made including the channel numbers optional to remove all the noise.
Code: Select all
Each channel has one id attribute, which must be unique and should
preferably be in the form suggested by RFC2838 (the 'broadcast'
element of the grammar in that RFC, in other words, a DNS-like name
but without any URI scheme). Then one or more display names which are
shown to the user. You might want a different display name for
different languages, but also you can have more than one name for the
same language. Names listed earlier are considered 'more canonical'.
Since the display name is just there as a way for humans to refer to
the channel, it's acceptable to just put the channel number if it's
fairly universal among viewers of the channel. But remember that this
isn't an official statement of what channel number has been
allocated, and the same number might be used for a different channel
somewhere else.
...
<!-- A user-friendly name for the channel - maybe even a channel
number. List the most canonical / common ones first and the most
obscure names last. The lang attribute follows RFC 1766.
-->
<!ELEMENT display-name (#PCDATA)>
<!ATTLIST display-name lang CDATA #IMPLIED>
This is also why I made including the channel numbers optional to remove all the noise.
- Gary
Keeping WMC alive beyond January 2020. https://garyan2.github.io
Keeping WMC alive beyond January 2020. https://garyan2.github.io
-
- Posts: 2839
- Joined: Sun Jun 02, 2013 9:44 pm
- Location:
- HTPC Specs:
It seems that these values are meant for a human to look at and choose, so there is no built-in way to determine what is what other than the first one on the list is the most "canonical" and then they get less so.
So again, I would say, just use the first one and ignore all the others if you are not going to be passing them on to a human to pick manually (which in the case of loading this in to SQL, it appears you are not).
If you will, then I would just put them all in the same text field with a delimiter rather than giving each one it's own DB field. Alternatively, make X number of fields (maximum display-names you think there will be) and then populate them sequentially, leaving the rest null. Of course this means if you make 10 fields and there happens to be 11 display-names, you will have to drop one.
So again, I would say, just use the first one and ignore all the others if you are not going to be passing them on to a human to pick manually (which in the case of loading this in to SQL, it appears you are not).
If you will, then I would just put them all in the same text field with a delimiter rather than giving each one it's own DB field. Alternatively, make X number of fields (maximum display-names you think there will be) and then populate them sequentially, leaving the rest null. Of course this means if you make 10 fields and there happens to be 11 display-names, you will have to drop one.
-
- Posts: 145
- Joined: Thu Sep 10, 2020 8:03 am
- Location:
- HTPC Specs:
If include channel numbers is enabled then it should be possible to use regex to determine field type by content and position.
1st is Callsign
2nd is Name if content is not a channel number + callsign. Or if 3rd and 4th are a channel number + callsign and channel number pair.
channel number + callsign and channel number pairs.
...
...
last is Affiliate if content is not a channel number. Or if last -2 and -1 are a channel number + callsign and channel number pair.
Otherwise if include channel numbers is not enabled I see no way to distinguish between a name and affiliate field. Unless both exist, or neither exist.
Channel Number + Callsign RegEx
String begins with one or more decimal digits: ^\d+
Optionally followed by a decimal point and one or more decimal digits: (\.\d+)?
Followed by one or more white spaces: \s+
Followed by the callsign to end of string: <callsign>$
Channel Number RegEx
String begins with one or more decimal digits: ^\d+
Optionally followed by a decimal point and one or more decimal digits to end of string: (\.\d+)?$
1st is Callsign
2nd is Name if content is not a channel number + callsign. Or if 3rd and 4th are a channel number + callsign and channel number pair.
channel number + callsign and channel number pairs.
...
...
last is Affiliate if content is not a channel number. Or if last -2 and -1 are a channel number + callsign and channel number pair.
Otherwise if include channel numbers is not enabled I see no way to distinguish between a name and affiliate field. Unless both exist, or neither exist.
Channel Number + Callsign RegEx
Code: Select all
/^\d+(\.\d+)?\s+<callsign>$/
Optionally followed by a decimal point and one or more decimal digits: (\.\d+)?
Followed by one or more white spaces: \s+
Followed by the callsign to end of string: <callsign>$
Channel Number RegEx
Code: Select all
/^\d+(\.\d+)?$/
Optionally followed by a decimal point and one or more decimal digits to end of string: (\.\d+)?$