The reorganization of a subcommittee (SC) does not usually attract a lot of attention. Then again, most of the groups that carry out standardization work do not win Emmy awards for their contribution to the entertainment industry. The IEC and ISO joint technical committee (JTC 1) believes that the new structure will enable SC 29 to work with greater agility as it develops the next generation of MPEG and JPEG standards.

IEC and ISO implemented the changes after an 18-month process of evaluation. SC 29 members voted overwhelmingly to approve the reorganization, which sorts the areas dealt with into eight working groups. The new structure elevates some of the current subgroups of WG 11 (MPEG) to become distinct MPEG working groups (WGs) and advisory groups (AGs) of SC 29, rather than subgroups of one of its WGs.

Teruhiko Suzuki of Sony Corporation will complete his term as chair of SC 29 at the end of this year. His successor, Microsoft’s Gary Sullivan, will begin his mandate at the start of 2021. Mike Mullane from IEC interviewed them about the reorganization and asked them about the future challenges for SC 29.

MM: I suppose that some resistance to change is always to be expected. But what is your answer to those who say that, given all the success over the years, there was no need to change anything: “If it ain’t broke, don’t fix it”?

Teruhiko Suzuki: It’s a good question. MPEG and JPEG in SC 29 were established nearly 30 years ago. The groups were designed for past ecosystems and applications like TV. Moving pictures and still pictures were very different industries. But technology has advanced rapidly. Now, from a technical perspective, the video and picture industries are overlapping. The difference between 3D graphics and video is blurring. The area is expanding, and the industry has changed, so a new structure for the next generation was necessary.

Gary Sullivan: As Teruhiko says, we had a diversifying set of applications and, ultimately, we’re listening to the perspectives of the national bodies that are participating in the work. They believe that a restructuring is needed. We had, particularly in the MPEG Working Group, many, many different projects being pursued in an increasingly diversified manner. There were quite a few proposals for different types of restructuring. There was even a proposal to make MPEG a separate subcommittee at one point. I guess some people thought it had become too diversified to be a working group.

MPEG has, for quite some time, had what it has called subgroup chairs within a working group. Some people felt that it was time to recognize those subgroups as working groups in their own right, rather than just treating them as something informal within a working group. Therefore, we basically elevated those subgroups of MPEG to become working groups. But I think we plan to keep much of the same style of how we hold meetings. At the working group level, at the participant level, there is less change than at first appears. Simply, what were subgroups are becoming working groups. But nothing much more than that is actually happening. We’re trying very much to preserve the existing culture and make sure that we don’t interrupt the flow of work in either JPEG or MPEG.

MM: One of the concerns was that elevating subgroups to working groups might make it harder for them to work together. How are you making sure that it doesn’t become more difficult to coordinate?

Teruhiko Suzuki: At least in the foreseeable future, all the MPEG working groups will continue to meet together at the same location and in the same weeks. That’s not something mandated by the directives on how working groups need to operate, but there’s a strong cultural identity that has built up around MPEG. People like to be able to participate across different working groups and different areas of activity in MPEG. They like to see what’s going on and, as you mentioned, try to make sure that the work is well coordinated. We want to maintain the branding of MPEG, the community of experts that comes together. We have had MPEG meetings about three or four times a year. And we plan to continue doing that, at least as soon as the COVID-19 situation allows us to resume regular meetings.

MM: What has been the secret of SC 29’s success and how important was it to win those Emmys?

Teruhiko Suzuki: MPEG provides audio-video entertainment to people. It provides a basic infrastructure for people’s entertainment. Before MPEG, people just watched normal TV, but MPEG provided data compression technologies for audio and video. Thanks to MPEG, people have been able to enjoy digital TV, HDTV and 4K, and audio now also includes 3D audio, lossless audio and high-resolution audio. And in SC 29 we also have JPEG. The JPEG codec was developed about 30 years ago and continues to make an important contribution to people’s entertainment. Emmys are the highest award for entertainment on television and cinema systems. They show that people really use MPEG and JPEG. The awards were proof of our contribution and a big honour.

MM: Looking to the future, a key issue is industry collaboration and how major corporations are now creating products that are beginning to compete with MPEG standards. What impact will that have on the future of MPEG in your view?

Gary Sullivan: This is not a new thing. Some people forget some of the previous instances where there were alternatives developed for the standards that have been developed in our domain. The alternatives have existed for as long as the standards have existed for video, audio and still pictures. We have a pretty good track record of the standards that we developed being adopted. I think we achieved that by having very broad participation, which creates things that are well thought out and well-studied, and just generally agreed to be an appropriate design for worldwide industry to use.

But patent licensing has been a significant problem for a long time. Sometimes it takes years for the licensing situation to settle out for some of our standards. And ultimately, if a reasonable licensing scheme does not develop around a standard, it will not succeed. It will not become what it should be if the patent holders don’t agree on a licensing regime that makes it reasonable and predictable for people to license.

A standard is a sort of value proposition, a combination of technology, capability and licensing requirements. But if the patent holders are not reasonable about it, it will not succeed in the market. Patents do expire eventually. So, if you can develop technology that doesn’t require the use of some patents, there could be a competitive advantage to that. We have had a number of industry consortia and regional or national standards bodies that have technology in the same space as standards developed in MPEG and JPEG. And so far, I think we’ve got a pretty good track record of adoption. But there’s no guarantee of success just by having a standard issued from us. The patent holders, whom we cannot control, have a large say in how successful a technology will be in the market. And if the licensing situation does not work out, a standard will not succeed.

Teruhiko Suzuki: In terms of collaboration, standards experts from companies can provide the requirements from industry. They provide a more balanced vision of technology. In SC 29, I think we have a good balance between academia and industry and it’s one of the reasons why we’re successful. One challenge is that some participants are also involved in consortia outside ISO, IEC and the ITU. People participate in those consortia and develop work specifications for those applications. Of course, we are competitive, but we must all work together to develop a new market. Otherwise it’s quite difficult. A single company cannot develop an application domain on its own anymore, so collaboration is necessary.

MM: Which new technologies will have the most impact on SC 29’s future work, do you think? AI?

Gary Sullivan: Yes, AI is extremely promising. People everywhere are interested in AI now for all sorts of applications. There’s one feature in our latest video coding that was designed partly using AI, but it’s not using AI necessarily in the encoding or decoding process. Some people have been able to optimize encoders using machine learning techniques. But the real breakthrough will be when we start doing things like compression or decoding using AI. We haven’t really gotten to doing that yet in a standard, although there’s lots of interest and people are starting to work on it and the results are really starting to come together. Things are evolving very rapidly.

I think that the signal processing architecture that you need for some of that is quite different from what’s traditionally been built into media systems. We’ll have to see when all that’s ready for standardization, but it’s a big area where people are showing a lot of interest and we’re starting to do a lot of studies. We don’t know yet where that’s all going to settle out.

We’ve done some standardization on the compression of a neural network itself, rather than necessarily compressing, say, video or audio using the neural network. Compressing the neural network is an interesting topic that has gotten some recent work in MPEG. It’s not always practical to bring the data to the neural network. Sometimes you should bring the neural network to the data and try to do federated learning. Maybe you have distributed training of a neural network. A lot of these things are just becoming possible. But there are lots of exciting applications and there’s lots of interest in JPEG and in MPEG lately. It’s one of the big, big areas that are being worked on now for potential future standards, although we can’t really say yet when it will be ready for real industry-deployed standards.

MM: I’ve read somewhere that you’re looking at ways to apply your know-how to compressing the human genome to a few megabytes.

Gary Sullivan: Yes, genomes are one of the most interesting new areas. We in SC 29 have a unique set of experts. We have people who understand compression very well. And I think you don’t find that elsewhere in the ISO/IEC world, where you have people with deep knowledge of information theory, quantization, decorrelating transforms and all that stuff that is necessary for compression. So, we have been open to working on different topics that need efficient ‘coding’. We often refer to compression as coding.

As you know, coding is about the efficient representation of data, and genomes are a very interesting new area for it. A full genome with all of the extra data that goes along with it, from reading gene sequencing data, is a huge amount of data. So if we can compress that well, in a way that’s designed to fit that kind of data, it can be a big benefit. And there’s, of course, going to be more of that kind of information and other kinds of information that we store.

I think medical applications are a key field. You think of CT scans and all sorts of medical imaging data that can end up being quite large and need to be stored. And in SC 29, we have the expertise you need to do that kind of thing. So we’ve looked at a diversifying set of applications and have been working in consultation with other standardization groups, so that maybe they design the basic form of some data, what needs to be represented, and we work on efficient compression of that data.

So, really, although I think we’re best known for video, audio and pictures, and maybe the systems multiplex that puts those things together and synchronizes them, there have been quite a few additional areas where standards have been developed in SC 29 for various applications. We’re looking at point clouds and light fields. We have standardized font compression. We have, as you mentioned, genome compression, and compression of neural networks. We’ve worked on the multi-view representation of images and depth maps for 3D and free viewpoint imagery. 3D audio is quite an interesting new area.

Another area we have worked on is the compression of graphics. A distinction that’s important is that we don’t necessarily work on the source definition of these things, but we work on how to compress them and represent them well. We try to collaborate with other standardization bodies that have related technologies that might need the compression expertise they can find in SC 29. We also work very closely with ITU, particularly on video coding.