Monday, 14 October 2019
MARIA ISABEL GRANDIA: Hello. Hi. Can you hear me? Good afternoon, I am starting the second session in the afternoon, please sit down because we want to keep time, if possible.
And I am Maria, I am co‑chairing this session with Peter Hessler, and we are having two plenary presentations, two 30 minute slots and three lightning talks. And if you have any questions, remember you have to go to the microphone and wait in line and if you have ‑‑ if you listen and someone has already asked something and you wanted to ask and you can just sit down again.
Okay. Our first presenter today is Ilona Stadnik, she is on the RACI programme so one of the presenters that came from the researcher side of the world and she is talking about Internet governance in Russia, trend on sovereignisation.
ILONA STADNIK: Good afternoon. It's quite an unusual experience to talk to technical community people because I am from policy, and today I am going to talk about Internet governance in Russia from the policy perspective, and I am sure that quite a few of you have heard about the recent Russian law on the stable Runet, or sovereign Runet, its more popular title. And actually, I really want to talk about how we ended up in such a situation, because it was really kind of a long‑lasting process of policy development.
So, basically, to structure my talk, I will use a very useful theoretical framework, the alignment of cyberspace to national borders. Basically it happens on three tracks, so the first is the national securitisation process and you see that it kind of unfolds in several sections. It's the reframing of cybersecurity as a national security issue, the militarisation of cyberspace, the nationalisation of threat intelligence, reliance on national standards and technologies, and the reassertion of legal authority for network kill switches. So it's really like policy stuff.
The second track is territorialisation of information flows and it's really about data and content and how it is restricted and localised inside the territorial borders.
And the last and the most I think interesting to you will be the third one, the efforts to structure control of critical Internet resources along national lines. And I will focus more on this part. Due to time constraints, I won't go too deep into details about the first two but I am really open to questions, so if you have questions about this part, they are welcome.
So, I must make a remark that, in Russia, cybersecurity is not the proper word, and on the official level, people don't use it. Instead, we have information security, and honestly, this is kind of a broad term because it encompasses both cyber, that is about infrastructure, and information, that is more about content and that. And the policy formulation of this information security, cybersecurity, started in the 2000s, it was the first doctrine on information security, and then in 2016 it was updated. It's a curious document, actually, and we have translations of it into English, so if you are interested, you can Google it.
And this was placed, like, information security, cybersecurity is a national security priority so that is how it started.
The militarisation part is an obscure thing because on the official level, Russia was kind of denying having information troops, cyber troops, like Cybercom in the United States, but in 2017 it was kind of intentionally or not leaked in the media that we really have them, and some notes go back to 2013, but it was kind of announced in 2017, actually.
The next thing is nationalisation of threat intelligence and, here, we have a long history of law‑making and kind of organisational policy. Here you can see two abbreviations, they are kind of frightening for you, but the first is the state information system for detection, prevention and elimination of consequences of computer attacks on networks. So basically, just to put it simply, it's like a nation‑wide CERT that collects data on cyber incidents and accumulates information on threats and all this stuff.
The second abbreviation is the National Coordination Centre For Cyber Incident Response, and interestingly, it was established just last year, and it absorbs the functions of the gov CERT that serves government networks, and the only important thing here is that this centre is responsible for international interactions between Russia and abroad, so now all the data sharing should go through this centre and not vice versa, if Russian CERTs have particular agreements with other CERTs to cooperate.
And, yes, we still have some private CERTs and the CERT of the gov id is efficient and ‑‑
Reliance on national standards, we have this programme on software, it started like in 2015 and obviously it was the result of the anti‑Russian sanctions, but I must admit that this is not a very successful policy development because we have this registry for the Russian software but there is no real practical use of this.
And now we have this talk of nationalisation of the Yandex company, and this is a very recent development so I had no opportunity to update my slides, but probably you follow the news and saw that the shares of Yandex have fallen.
The price of ‑‑ yes, the price of the shares.
So, and the last one is the reassertion of the legal authority for network kill switches. Here, I would like to point you ‑‑ to point out that in Russia there is a discourse of an external kill switch. It may be very surprising for some of you, because we have, not we but there is quite an established practice of local Internet shutdowns in different states, predominantly authoritarian, but nobody was really talking about an external kill switch, and this is the Russian official position, that the Internet really can be kind of cut off for a particular country externally, from different organisations, but not internally, and this is ‑‑ this position really led to what was developed further as a law.
But still, we have documented cases of local shutdowns of mobile Internet, most recently in August in Moscow during the rallies about the Moscow elections. So the state kind of has the ability to do this.
Then, very briefly, the second broad part of the alignment of cyberspace to national borders is the territorialisation of information flows. Basically, there are two strands: content filtering and data localisation. The filtering practices started in 2012, with the adoption of specific laws that described the particular types of information that should be prohibited from dissemination on the Internet. Basically, it's like drugs, suicide, child pornography, extremist content, but the problem is that this list of prohibited information is extending each year and we are adopting more laws on this. Yes, we have this blacklist that is run by the Russian supervising body for communications, Roskomnadzor, and special systems to monitor the operators' compliance, whether they take down the prohibited content, and just recently we got this law that requires search engines to connect to this state system with the blacklisted sites to automatically filter it. But here is the situation that Google does not kind of subscribe to this system, and from what we have seen in the media, it's a very strange kind of arrangement between Roskomnadzor and Google that they manually remove the links from the search results that lead to some prohibited information, but this is how it works.
And the last part is data localisation. It was even before the GDPR in Europe that we got this law on data localisation and protection of personal data, and as for the consequences, LinkedIn as a social network is blocked because they refused to move their servers and store the Russian data locally. But still, in regard to social services, social networks, social sites, Roskomnadzor is kind of not willing to make a decisive step and just block them as it did with content. So, everything is possible in Russia, you see.
You can violate the law but you can still function.
And we move to the main part of the talk. So, as for the control over critical Internet resources, the law that was adopted this spring, was signed this spring and should be enacted in November, was not the first of its kind. Basically, in 2016, there was already kind of a draft law that dealt with critical Internet infrastructure in Russia. There were several kind of cyber drills that tested the reliability of the Russian segment of the Internet, and whether it can be kind of switched off. Well, these drills were kind of under secrecy, but there were some official results and it was said that it's possible, so we need to do something about this. And the first draft was kind of copying the DNS services, the RIPE databases for routing, so just a copy to refer to if something wrong happens. Here we can see the prototype of the special information ‑‑ other special information system that operators should kind of consult in their work, but this draft never went further, due to various reasons, and just surprisingly we got a new draft law in ‑‑ last winter, last December, and it immediately got the catchy title of the law on sovereign Runet, but as I told you in the opening, it is the law on stable functioning, so you can see the difference. And you see that in the explanatory note and I have, the aim of this law is to protect the Russian segment from ‑‑ probably for technical people this phrase does not make any sense, for policy people it's kind of a real threat and kind of an impetus to enact such a law. And you see it took only six months, it's incredible speed, to adopt this law, to go through all the debate and discussion stages in the legislative body, and it was signed this April by the president as a law, and it should enter into force by the 1st November, so just half a month ahead.
What ‑‑ so this is ‑‑ just to inform you, this is not a law in the normal sense: we have a law on communications, we have a law on information, and this particular law is a set of amendments to both of them. And to be honest, it looks like a broad framework of what should really be done about critical infrastructure in Russia. But altogether it gives a lot of powers to the supervising body, so it really transforms from a supervising into a managing body, and this is really strange. And one of the main reasons to push for this law was the failure of the supervising body to block the Telegram messenger. A brief history: it was blocked by court decision last year for non‑compliance with another law. Telegram is a kind of secure encrypted messenger, and under the special provisions of the anti‑terrorist amendments, such messengers should hand over the encryption keys so law enforcement agencies can access your messages in case they need to, and Telegram refused and was blocked for this. And there was a lot of effort to block it, and they tried a lot of technical solutions to do this but they were all unsuccessful, and with this new law there is kind of a hope that by extensive traffic filtering they can bring the messenger down.
So, what is inside the law? The law kind of defines who the main subjects of network regulation are. So it's owners of technical communication networks, owners of autonomous system numbers, owners of traffic exchange points, and owners of communication lines crossing the state border. That's it. So basically, they will kind of keep track of them, who is who, who is owning what and all this stuff. Also, the supervising body now has the power of centralised management of the communication networks in the event of threats, and this is a very important point because under the logic of the law, if something happens externally to the ‑‑ and internally too, in terms of content, the routing policies should not be implemented by the network operators but by Roskomnadzor centrally, so it's like a top‑down hierarchical system.
And the last important part is that network operators are obliged to establish ‑‑ to install a special technical equipment, in Russia it is called technical measures to combat threats, but basically, it's like a twofold equipment so it should execute the central management of traffic routing and filter the content, and so far, it's like DPI system so from what we have heard from the mass media.
And the final point: the law creates a centre for monitoring and control of public communication networks that will be run by Roskomnadzor, and the curious point is the national domain system. So, the national domain system is still under development, put it like this, because we have ‑‑ because we have a regulatory act that says that the national domain system is comprised of the domain .ru, the domain .рф, the domain .su and domains that are owned by Russian legal entities, that's it. But there are no words about how it will be formed or how it will be used, and it is ‑‑ it still has to be kind of developed, and so what is going on until November 1st:
As I said, the law is more like a framework, and it has a lot of blind spots inside because it doesn't tell exactly what operators and other participants of this regulation should do. And these 40 regulatory acts should fill the gaps. But so far, only five of them have been ‑‑ only five of them have passed approval and basically, they extend the powers, so it's like an organisational thing: they extend the powers of the ministry for digital development, the former ministry of communications, create the centre, and set this list of what comprises the national DNS system. That's it.
Out of 28 prepared for consideration regulatory acts, 17 of them violated the procedure for introducing new regulatory acts for public discussion, and got a negative evaluation for their regulative power. So what does it mean? So basically, people, like technical community, can comment on these regulatory acts when they are discussing and drafting, but the problem is that, in majority of cases the comments are neglected and when the drafting period of these regulatory acts are closed, sometimes they are not taken into account at all.
And so what about the filtering capacity of these technical means, technical measures? Roskomnadzor announced that in the Ural region they started testing the equipment, but I want to focus on the fact that the technical specifications of these technical means are still not kind of officially adopted as a regulatory act, so it's really interesting what exactly they are testing if there is no special requirement for this.
I will wrap up. So, as I told you, I am a policy person, and I was trying to explain how things are going in terms of sovereignisation from the political side. And you see that the political intention of the government to align its cyberspace to national borders is really strong and you see this gradual development in different spheres. However, the most important ambition, to align the technical part of the network, is still under consideration and nobody really knows how to do this, but the claim to make the Russian Internet independent from the global Internet, in the terms that Russia needs it, is really kind of tempting as a political model for other states, and that's why this unique discourse to keep Internet accessible despite internal shut down can be seen as a desirable state of affairs for other states with authoritarian regimes, for example. But so far what we can see from the developments regarding this law is that the technical part, they just decided to put it aside for a while and concentrate on the filtering practices, and so, for now, it's like a question of whether the government will be able to make these surgical shutdowns of Internet in regions and filter traffic better than they do now, because even though we have this very extensive blacklist of IP addresses, URLs, and DNS addresses, sometimes it's ‑‑ it doesn't work, and so people still can access some resources, and the big hope for this law, for the government, is to stop this, stop this kind of violation of current laws and to make it more strict.
I think that's it for now. I have three minutes left and so if you have questions?
PETER HESSLER: Thank you. If you have questions, please come up to the microphones, and state your name and affiliation before asking your question.
AUDIENCE SPEAKER: Alexander from Russia, from the Internet Protection Society. Thank you very much for this presentation, sometimes it works to go from Russia to the Netherlands to understand the machinations of your own government. I think we should ‑‑ we technical people should cooperate more with policy people to understand more what is going on. Also, I would like to suggest that we should cooperate between ourselves, because as I see some other European and even very democratic governments try to ‑‑ with nearly the same, so we should raise awareness of this because even in your case the situation can go as bad as it is now in Russia, so let's cooperate and raise awareness of such initiatives in other countries, thank you.
AUDIENCE SPEAKER: Ali from Iran. So the story is familiar to me because we do the same. I have got one question: with most of the protocols, the DNS, the IP routing, it is possible to have a kill switch to the Internet. What is Russia going to do about the certificate authorities, because everything is now over TLS?
ILONA STADNIK: Well, not everything, right. I may be asking a very stupid question because I am not a technical person. They really don't pay too much attention to this, all this DNS over TLS, over HTTPS, they pretend these do not exist and that is why they design the law in this way, so it's just neglecting the technological developments.
ALEXANDER ISAVNIN: The Russian authority once blocked commodity IP addresses so even government sites could not be identified, so they blocked some IP addresses but they will motivate this very strongly, they are just blocking some IP addresses and certification authorities failing somehow. Again, there is no general policy.
AUDIENCE SPEAKER: It's very hard to hear all of this mess because there is a lot of misinterpretation. Certification, it's easier just to revoke certificates for the banking system and for ATM machines and to stop the whole financial system. It can be done on any level against any country in the world. And please don't mention the ‑‑ I don't know democratic governments in this world. Thank you.
JIM REID: That's a very interesting presentation and thank you very much for coming. I am just a random DNS guy that wandered in off the street. I think you answered my question indirectly, but initially to do with content filtering, are the Russian authorities looking at DNS as a primary tool for that and if so what are they going to do about ‑ deployment?
ILONA STADNIK: As I already told, they don't use only DNS for blocking and filtering, but they really ‑ this technology as over DoH, I don't know whether they really want to consider it because I have ‑‑ I haven't seen any records on this, but it's not a very popular topic in Russia.
JIM REID: Thank you. I would love to have a chat with you over coffee or a beer later in the week thank you, thank you.
AUDIENCE SPEAKER: Ivan Beveridge from IG. I think this is kind of largely been covered but I'm wondering whether the authorities have ‑‑ have any consideration about their securing of routing protocols in DNSSEC and the like, whether they're looking to block that in ‑‑ for Russian organisations, or whether they don't see that as an issue with regards to the kill switch?
ILONA STADNIK: I think they don't see this as an issue.
PETER HESSLER: As there are no more questions, thank you very much.
So next up we have Sasha who is going to talk about software engineering for network engineers and others. Please remember to rate all the talks that you see, please give your feedback to the PC. Additionally, there are two slots for candidates to the Programme Committee, they are open. Please send an e‑mail to pc [at] ripe [dot] net if you are interested. And it looks like we are ready. Go ahead.
SASHA ROMIJN: Thank you. So this is software engineering for network engineers, my name is Sasha, I have only a decade of professional software engineering experience. I have done some network stuff, I used to work on the Routing Information Service at RIPE NCC, also did stuff for the meeting network, a long time ago. And one of my ‑‑ I do a lot of Python contracting and the Django web framework, the most popular Open Source web framework, and I am also, I spoke about this a year ago, the main author of the Internet Routing Registry daemon, IRRd, version 4, for NTT, which is a rewrite of ‑‑ they run their own IRR and I ‑‑ I am a core team member of Write the Docs, and a bunch of other things.
And so why would network engineers and other people, you don't have to be a network engineer for this, need to know about software engineering? Because a lot of people in here will end up writing software at some point, maybe not at the scale I do, but you are likely to at some point write code or scripts or someone on your team is going to. So basically if you have written a 30‑line bash script to automate something you are a developer, and it means you can benefit from learning how to do it right, or at least to the degree that I can fit that in the length of this talk.
And so adopting good practices, especially early on, will make your product cheaper. I have seen a lot of cases of someone who at some point wrote a little script, a bit of Python or whatever they'd like to use, and they left ages ago. It's now five years old, nobody knows how to change it, it has somehow become business‑critical because people built around it and things were built on top of it, but it's actually poorly written and nobody dares to touch it and everybody starts to work around it. Having good development practices will reduce these kinds of dependencies on poorly written code. It will make it easier to change things, not only in the things you wrote but in all of your processes, because you are going to depend less on hard‑to‑maintain code. It will also make you less dependent on individual people: if other people can maintain something, you get to spread out the work more.
So the first thing is to use version control; not even all professional developers that I have encountered do this. It allows you to track changes to your code over time, which is essential because code will always change, and it helps people trace back into history: when was this code added, why was it added, is there a context to it? If you are doing some software archaeology, I often end up looking at source control to see what happened to this code.
The biggest ones are Git, Subversion, CVS and RCS; I don't recommend getting into the last three. For Git in particular you don't need a network. It's in Git, and you start to keep track of them, if I do it again I will probably tweak it and I get to keep history. You can use things like GitHub or GitLab for hosting, you get someone on the Internet to do it, use the Cloud. If you don't know what to learn, you should learn Git because it is so incredibly widely used in Open Source, it will be a useful skill for a very long time.
So, let's do a crash course in Git. To start using ‑‑ this is basically all you need. You set your user name and e‑mail once, create a repo, add your files and commit them. If you want you can push them to somewhere like GitHub, you don't have to. If you put things on there they don't have to be public; eventually they will want some money from you but not a lot. When you make changes, you edit your local files, you have to explicitly add any new files, and add a useful message to them, otherwise it's no use. Push your changes. This is basically all you need to know. You can put anything, even if it's small, in version control with basically no effort at all. There are a lot of very interesting, sometimes useful features of Git: you can do fancy things with rebasing and restructuring your whole history and fold commits together, move them between branches, but you don't need to know these kinds of things to get started. They will eventually be useful if you do a bigger project. For IRRd I really need them. This is all I need to put things in version control.
Second is to write tests, maybe the most important part of this talk. Having tests will, contradictory as it sounds, actually save you development time in the end. I sometimes see people who think it's not worth it or that it's going to take more time, but in the end you will save time.
Bigger projects tend to have complex ‑‑ 4,000 lines of tests that I wrote, but you can start very small with this; it's best of course if your project is small, so a single integration test can already add immense value.
Ideally with tests you want to test the weirder scenarios. Everything goes well when all the users enter their data neatly and there are no strange situations, that is usually very easy, but you especially want the weirdest possible things that can happen: what if users input something in the wrong format, or the JSON file you are trying to read has invalid syntax? What if somebody submits IPv6 addresses in an INETNUM, what if the query parser fails, which it shouldn't, what error do we get back to the user? Do we not leak security sensitive information? All the weird cases are especially important to test.
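As a small illustration of testing one of these weird cases, here is a minimal Python sketch. The names `load_settings` and `SettingsError`, the error wording and the sample password are all made up for this example and do not come from the talk. It checks that a file with invalid JSON syntax produces a clear error, and that the error does not echo the file contents back to the user.

```python
import json
import os
import tempfile


class SettingsError(Exception):
    """Raised when the settings file cannot be used (hypothetical name)."""


def load_settings(path):
    """Read a JSON settings file, turning syntax errors into a clear message."""
    with open(path, encoding="utf-8") as handle:
        text = handle.read()
    try:
        return json.loads(text)
    except json.JSONDecodeError as error:
        # Say where the problem is, but never include the file contents,
        # so nothing sensitive ends up in the error shown to the user.
        raise SettingsError(
            f"invalid JSON at line {error.lineno}, column {error.colno}"
        ) from None


def write_temp(text):
    """Test helper: write text to a temporary file and return its path."""
    handle = tempfile.NamedTemporaryFile(
        "w", suffix=".json", delete=False, encoding="utf-8"
    )
    with handle:
        handle.write(text)
    return handle.name


def test_valid_file():
    path = write_temp('{"password": "hunter2"}')
    try:
        assert load_settings(path) == {"password": "hunter2"}
    finally:
        os.remove(path)


def test_invalid_syntax_does_not_leak():
    path = write_temp('{"password": "hunter2"')  # closing brace is missing
    try:
        try:
            load_settings(path)
        except SettingsError as error:
            message = str(error)
        else:
            raise AssertionError("expected SettingsError")
        assert "invalid JSON" in message
        assert "hunter2" not in message
    finally:
        os.remove(path)
```

The two `test_` functions can simply be called by hand; a test runner would normally collect and report them for you.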
It tends to grow with your code. An integration test is basically something that tests end‑to‑end. We don't all use the same words for the same kind of test; this is kind of a tricky area. So, if you are going to have bigger projects you will want smaller tests too, so you can run through a lot of scenarios.
And also, once you have a test suite, if you find a bug you can write a test for it, which means you know you have appropriately identified the bug and can be sure that you fixed it, and you also know that it won't reoccur because your tests will fail.
So an example: I wrote a script that calculates IPv6 route size stats. This may not be the most efficient; it pulls the latest Route‑6 objects from the RIPE database dumps, finds anything that says Route‑6, which are the primary keys, gets the second field after the slash, counts it for unique values and sorts that again, which basically means that it reads objects like this and then in this case counts the 33, and so it would end up counting that there is one Route‑6 object for a /16 and a lot for /48.
This code is basically non‑testable. You can try to see if it runs and say, well, it didn't produce an error, it produced numbers, but are the numbers right? It's hard to say because the input data keeps changing, which is why you would want input data that is the same every time so that you know what the correct output is. So we change it a little bit, and make it say: if the first parameter is a dash, then you read all the route data from standard in, so we can give a fixed input. If not, download it as shown. This already becomes a lot more testable. The downside is that part of this isn't tested: where it retrieves the data, is this a valid parameter, we are not testing that. You could do that, I can do it, but it won't fit on a slide. And still, partial testing is better than no testing. We are covering a lot more than you would have without. A simple Route‑6 object, a /33, and then this is basically all you need to do to test it: the test input on top is what I just showed you, and then the output we would expect is that it says I found one route and it's a /33. You run the script, feed that data in and see if the output matches. You have now immensely increased the reliability of this product, even though the test isn't perfect and not complete. The entire bottom part, that says if the test output is the expected output then say the test has passed, blah‑blah‑blah, you don't have to do that in virtually any language. There are actually frameworks that will help you do this even in bash but I wanted to keep it simple. In a lot of languages this is a built‑in feature, to say: is the expected output the same as the actual output? If not, print an error, keep track of it, say what the difference was. This is built into most languages that you might use.
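The idea of feeding a fixed input and comparing against a known output can be sketched in Python as follows. This is not the script from the slides, which was a shell pipeline; the function name `count_prefix_lengths`, the documentation prefix and AS number, and the source name `EXAMPLE` are assumptions for illustration. The matching is deliberately as naive as described in the talk: any line that mentions route6 anywhere is counted, and the key is the second field after the slash.

```python
def count_prefix_lengths(lines):
    """Count occurrences per prefix length for lines mentioning "route6".

    Naive on purpose: the match is anywhere in the line, and the key is
    the second "/"-separated field, or the whole line if there is no "/".
    """
    counts = {}
    for line in lines:
        if "route6" not in line:
            continue
        fields = line.strip().split("/")
        key = fields[1] if len(fields) > 1 else fields[0]
        counts[key] = counts.get(key, 0) + 1
    return counts


# Fixed test input: one simple route6 object for a /33.
TEST_INPUT = [
    "route6:         2001:db8::/33\n",
    "origin:         AS64496\n",
    "source:         EXAMPLE\n",
]


def test_single_route6_object():
    # With a fixed input, the correct output is known in advance.
    assert count_prefix_lengths(TEST_INPUT) == {"33": 1}
```

Because the function accepts any iterable of lines, the same code can be given a list in a test or an open file or standard input in real use. The download step is left out here, so, as in the talk, that part stays untested.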
So the actual test could be only four lines. There is a catch, because the actual output of the script when I wrote this was this: those two lines aren't supposed to be there. They are not ‑‑ they are not lengths of route objects, which is because I grep for Route‑6 anywhere in the line, not at the beginning of the line, and some people wrote description lines that look like this.
What do you do then? You take that weird data, put ‑‑ oh, this is the actual broken object, or the object is fine, it just breaks my parser, this is a valid thing to do. And so what you do is extend the test data with the weird data like you do in the count 4, and if I run my test suite it will say I expect to see one /33 but it gave two lines so something is broken. Then you go ahead and fix this, and you can keep doing this with everything you find that didn't work or kind of worked but not really, and you know that this kind of bug will not reoccur. This is a very simple example because I have only a few minutes and it needs to fit on slides, but there isn't really that much more to it on bigger projects. It's essentially the same thing. Architecture becomes a thing that you may need to work more on, that's always the thing with bigger projects. Fundamentally this is it. And it looks simple because it is not that hard, but it makes an immense difference. For a product like IRRd, which is about 10,000 lines of Python, we ‑‑ this is the only reason that it worked. If I didn't have a test suite the ‑‑ this would have ended poorly.
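A regression test of this kind might look like the Python sketch below. The description text is invented for this example, since the transcript does not record what the real description lines said, and `count_prefix_lengths` is a hypothetical name. A parser that matches route6 anywhere in the line would count the description line as well; this version only accepts lines that start with the attribute name, and the test pins that behaviour down.

```python
def count_prefix_lengths(lines):
    """Count prefix lengths, looking only at lines that start with "route6:"."""
    counts = {}
    for line in lines:
        # Anchoring the match at the start of the line skips free-text
        # lines that merely mention the word, such as descriptions.
        if not line.startswith("route6:"):
            continue
        prefix = line.split(":", 1)[1].strip()
        length = prefix.rsplit("/", 1)[1]
        counts[length] = counts.get(length, 0) + 1
    return counts


# The earlier test data, extended with the weird line that broke parsing.
WEIRD_INPUT = [
    "route6:         2001:db8::/33\n",
    "descr:          route6 object for a test customer\n",  # invented text
    "origin:         AS64496\n",
    "source:         EXAMPLE\n",
]


def test_descr_mentioning_route6_is_not_counted():
    assert count_prefix_lengths(WEIRD_INPUT) == {"33": 1}
```

Once this test is in the suite, reintroducing the anywhere-in-the-line match would make it fail straight away.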
If you have tests you might also consider continuous integration. I wouldn't do it at this scale but this used to be ‑‑ it's basically that every time you commit, the tests run and it will tell you whether your current committed version is working, whether it's valid. It used to be a horrible thing to configure, it would take someone who is experienced a whole week and it would randomly break. Now you can just get someone else to host it for you, often for free. You will need to give it a little bit of configuration but this very quickly becomes worth it.
The next thing is readability. Especially when I see people think, I'll write a little script to fix this little problem, they tend to be very sloppy with the quality and readability of the code they write because they think it won't matter. I often see that when someone says this is a quick fix for something, it is still there ten years later; it is more common that this is false than that it is true. Things were supposed to run for three weeks and they ended up being closed after five years.
Some good rules are, if you make code comments ‑‑ comment, not ‑‑ comment ‑‑ not the how actually, the why, mainly: why are you doing something, that is a good code comment. What is this code supposed to do? That is usually a poor code comment, it probably means your code can be better. Not always but almost always.
And ask yourself how clear this code is to yourself in a year, if you don't remember the context, if you don't remember all the domain knowledge, or if someone else is trying to read it, because when you write it the image can be very clear in your head because you have all the parts in your head. But code gets read a lot more often than it is written, so extra time spent on readability will save you time in the end. One of my favourite examples is one‑letter variable names: A is a terrible name, you can do better, each of you can. Last week I spent half an hour trying to figure out how something worked, where a variable actually meant different things in different parts of the same function. Other people have done this before me.
You can see it even with the little script I wrote before, if I just flip it to one letter variable names, instantly this has become a harder to read thing and it is eight lines.
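The same effect is easy to reproduce in a few lines of Python. Both functions below do exactly the same thing; the code and all names are illustrative and are not the script from the slides.

```python
def f(a):
    # One-letter names: correct, but the reader has to work out what
    # a, b, c and d stand for before the logic makes any sense.
    b = {}
    for c in a:
        if c.startswith("route6:"):
            d = c.rsplit("/", 1)[1].strip()
            b[d] = b.get(d, 0) + 1
    return b


def count_prefix_lengths(lines):
    # Identical logic; the names now carry the meaning.
    counts_by_length = {}
    for line in lines:
        if line.startswith("route6:"):
            prefix_length = line.rsplit("/", 1)[1].strip()
            counts_by_length[prefix_length] = (
                counts_by_length.get(prefix_length, 0) + 1
            )
    return counts_by_length
```

Nothing about the behaviour changed between the two versions, only how long it takes a reader to understand it.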
Next is refactoring, when we restructure our code but don't actually change the behaviour, or very little at least. Sometimes you do this because you made a bad design choice in the past, or because business requirements changed in a way that wasn't anticipated. It happens in short cycles: when I build something and then I want to commit it or merge it or release it, I always look over my code for readability because sometimes, especially when it's complex, I sort of first dump out whatever was in my brain about how to solve this, then I take a break and look at it again and usually it is not the best I can do. But it also happens in longer cycles, when you have had something for a while and it doesn't meet your needs any more. It is something essential in all software, and exponentially more important as it gets bigger. For example, IRD uses a framework for process management which is not going to work for the future, it will block us in future development, so I am going to have to take out that whole thing and replace it with something else, which is quite a task. Refactoring also requires not being scared of change. I have a very solid test suite so I am not worried about doing that, even though it's changing a whole part of the architecture; I have a lot of trust that it will work.
And so this is also something you are going to need tests for. Without them it is terrifying, because you don't know what you are breaking and sometimes it's the parts you didn't expect.
Being able to refactor also means not getting too attached to your code. I see this especially with new developers and people who don't have a lot of experience. Code is a volatile, temporary thing in some cases. A while ago I worked for two days on a feature and then I threw everything away because I took the wrong approach and I got stuck, and this is a normal thing to happen. It's not always two days, and I learned on the way how to do it, so then it's pretty easy, but throwing work away is something that will happen a lot. Throwing away the code you worked very hard on last year because it is not the right choice any more today is really common. And when people do get too attached I start seeing things in code bases, and this happens a lot, like "we don't use this function any more" and then there is a commented‑out 300‑line function from four years ago. If you have source control and version control there is no need to do this. If you don't need the code, delete it. Also things like "I think this is broken so I have disabled it" and they have commented out ten lines of gibberish which doesn't actually compile. Most of the older code bases I work on have things like this and, in the last ten years, the number of cases where I have been like "oh, this is so nice, past developer from years ago, that you left this bit of broken code in there for me to read" has been zero. This has never, ever been useful. This is a thing where people are like, "I worked so hard on this, I don't want to delete it", and leave all this commented out.
So, over‑engineering is also something I see, especially in newer people. It is also a tricky one: as projects get more complicated it can be useful to anticipate future needs, but a lot of the assumptions you make are wrong; what you think this software will do in three years is probably not going to hold. And also, the further away, the less likely this is going to happen. And so this is why it's something to be careful with: a lot of expectations about the future of your software are going to be wrong and it's more important to focus on requirements now. There is a balance, and it's something that you have to learn with experience to a degree, but when in doubt I tend to default to "what do we actually need now?" rather than "how do we make this compatible with the requirements that we have two years from now?"
Also, all software needs at least some documentation, even the tiny script I just showed, and a good place to start if you have nothing at all is a README. A product like IRD needs a lot more, but there is also a README. And it covers things like: What does this product actually do? Why did we write it, who wrote it? How does it interact? Is there more documentation? If so, where can someone find it? Are there any known limitations of this project? So for the script I just showed you, you could say: this script generates stats on the number of route6 objects for each prefix length, the first number being the number of objects and the second being the prefix length; this was developed by me in 2019 to serve as input for some kind of process. There is a test script. And you could say the output is based on the RIPE database split dump, which might be 24 hours old, and it doesn't support IPv4 ‑‑ who uses IPv4 anyways?
Lastly, it's also important to know your limitations. If you have very limited experience and limited access to mentoring or training, there are limits to what you will write well. The same way that I can probably still pretty quickly figure out how to set up a BGP peering, but if I were in charge on my own of a large network, everyone would have a very bad time. So a good example is this. You don't have to be able to read this. This is a patch that a client submitted after finding a bug in a project, which was a file permission that was set incorrectly. This patch is entirely correct: the code is in the right place, it uses standard Python APIs, the tests succeed and it correctly fixes the problem, so this is at first sight a good patch. The one I committed was this, which you definitely cannot read. And so the patch was correct; this was submitted by a client who is not an experienced software engineer, like they know Python, they correctly wrote that patch, but my thinking when I see this is: do we have other places in the code base where the same bug may exist?
And this is a bug that someone noticed in production, so that sounds like something that is definitely a bug that should have a test, so I added a test for this particular case and also for another case which might be similar. And the behaviour changed, so that means it might be worth adding to the documentation that this is now the intended, explicit behaviour. Fundamentally, this difference is experience. The other thing that basically needs a lot of experience is architecture design. That is also something where I can give directions but, to a degree, it is something that you learn with experience. But with the tips that I gave you, you can already make a lot of big steps in improving things, and as you actually do that more you will gain that experience and get better at this.
There are a lot of other parts, but this is only half an hour so there are limits to how much I can cover. Being an independent contractor also means you get to deal with the human side of things, like having a client ask "can you add Google to this product?", as in "can you reimplement Google and put it in the product you made for us?", and trying to figure out what they want. So, there are a lot of things around this that you will actually encounter, but I think with this you have a good start to improve the code that you write yourself, especially those small things that are probably just there temporarily, that you are just doing as a quick hack that will definitely not be around in another five years, except they will.
Thank you so much.
MARIA ISABEL GRANDIA: Thank you, that was quite clear. Are there any questions for her?
ANNA WILSON: Cool talk, thank you very much. Among network engineers some of us are totally on board, we do need to automate more and use more software engineering techniques, and some of us really feel too busy taking care of what is already there by hand, with the techniques we know. Any advice for moving people from the second category to the first?
SASHA ROMIJN: That is a difficult one. You see this with software engineers out there too: "tests are useless and it just takes time". For one thing, it is also a management thing: if you work in an organisation, they should be making space for that. You are going to need it in the long term. Generally what I see in network engineering is also an increasing trend of automation, and so you are just going to be left behind if you don't do this; you are going to deliver poorer work. So it is something you will have to make time for, and it is something that your management would have to consider a priority: to free up some time from the day‑to‑day grind to explore new things. Which goes for any profession at all, really, not even just engineering.
ANNA WILSON: Thank you.
MARIA ISABEL GRANDIA: Any more questions? So thank you, Sasha.
Our next speaker is Willem Toorop. He is going to talk about DNS. If it's a five‑minute presentation, it's five minutes for questions.
WILLEM TOOROP: Yes. So this talk is about some very rough ideas on how to make this power of DNS measurement information accessible. It's also the result of discussions I had with Emile Aben and Jasper den Hertog, because they want this data to be available to the biggest possible audience, to people like you. And with this presentation, I basically want to ask you to tolerate the DNS terminology a little bit and see if there is something here that could be of use for you, and how it could best be of use.
But I do not want feedback right after the presentation, probably, because I have 20 slides, so I have no room for questions.
So this all started with the DNS measurement hackathon that was organised by the RIPE NCC in April 2017. I participated in a group called DNSThought, together with Andrea, Petros and Jerry, and our idea was to create a portal that would inventorise all the properties and capabilities of all the resolvers that were in use by Atlas probes. Capabilities and properties of those resolvers are, for example: can they do IPv6? Do they do TCP? Do they do DNSSEC validation? Do they rewrite non‑existent names, and do they do Qname minimisation, those kinds of things? You can measure those things by sending queries, sometimes just by sending queries for normal zones: if you have a zone on an IPv6‑only address, then only an IPv6‑capable resolver can reach it. But sometimes you need the authoritative perspective too, for example with Qname minimisation, to see whether the resolver is using the smallest possible label to ask a question.
So part of the project was also the DNSThought daemon, an authoritative name server which inventorises properties of the queries it sees and then replies accordingly. So here you have DNSThought replying over TCP only and also, when it replies, it will always mention the IP address of the resolver that it saw at the authoritative, and that's quite unique, I think.
So that is a property of RIPE Atlas. With a RIPE Atlas probe, you know the IP address of the probe, and thus also the network in which it resides, but you can also see the IP addresses of the resolvers it uses, so you can see whether a resolver is within the same network as the probe or in a different network. And with the DNSThought daemon we also see the IP address with which the query is getting out of the DNS cloud. So what is happening in the Internet we don't know, but we know with which IP address it's getting in there and with which IP address it's getting out there.
This is quite interesting because, based on this, we can sort of say: this probe's resolver is completely internal, right? This is what we see a lot with ISPs: the probe is in an ASN, the resolver is in the same ASN and, at the authoritative, we see the IP address coming out of the DNS cloud in the same ASN; everything is in the same ASN. Forwarding: a resolver which is forwarding is in the same ASN as the probe, but then it comes out at the authoritative from another ASN. External: the resolver configured on the probe is not in the same ASN as the probe.
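The three categories described above can be sketched as a small function. This is only an illustration of the comparison, not the DNSThought code; the function name, the argument names and the example AS numbers (taken from the range reserved for documentation) are assumptions.

```python
def classify_resolver(probe_asn, resolver_asn, exit_asn):
    """Classify a probe's resolver by comparing three AS numbers.

    probe_asn:    ASN of the network the Atlas probe is in
    resolver_asn: ASN of the resolver address configured on the probe
    exit_asn:     ASN of the address seen at the authoritative server
    """
    if resolver_asn != probe_asn:
        # The configured resolver already lives in another network.
        return "external"
    if exit_asn != probe_asn:
        # Resolver is local, but the query leaves via another network.
        return "forwarding"
    # Probe, resolver and exit address are all in the same ASN.
    return "internal"


if __name__ == "__main__":
    print(classify_resolver(64500, 64500, 64500))
    print(classify_resolver(64500, 64500, 64501))
    print(classify_resolver(64500, 64502, 64502))
```

The order of the checks matters: a resolver outside the probe's ASN is called external regardless of where its queries exit.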
So, at the hackathon we had, I think, almost 10 measurements that were scheduled by Emile Aben to run forever, for every possible Atlas probe, every hour, for every resolver that it has. We had a little portal which mentioned those properties for the separate probes. Then, later in 2017, I joined forces with Roland and we added a lot of measurements measuring all the different DNSSEC algorithms for the project. Then in 2018 I joined forces with Moritz Müller and we added measurements to measure the root KSK and DNSKEY roll‑over. So it has been growing, more and more measurements have been added, and currently there are 88 measurements performed on all the resolvers of probes on RIPE Atlas every hour. All the measurements are listed on this link, dnsthought.nlnetlabs.nl, and the results of those measurements are interpreted to generate plots. For example, this plot is for Deutsche Telekom, showing which probes have internal resolvers, which have forwarding and which have external. The ones with external resolvers are Atlas probes that have a specific, different resolver configured than what Deutsche Telekom gave them.
So these results have been used in scientific papers, and bigger DNS operators such as Google use this platform, but how can this be of use to you?
So, in discussion with Emile and Jasper we thought of a few things. One thing we thought could be useful is community tags in RIPE Atlas: for example, to schedule some sort of measurement with all the probes which have a resolver which exits on ASN 16159 or 1103, or to schedule a measurement to all the probes that have at least one resolver that does Qname minimisation. But then we thought, okay, ed448, Qname minimisation, what do all those properties mean? So we thought we would need some sort of mechanism to make it more meaningful, so you can actually know what you are querying. So we thought about some sort of system that uses those properties to score, to give stars to, a specific category of resolver capability: how is this resolver doing security‑wise, how is it doing privacy‑wise and performance‑wise? So, for example, for security you would get 60% of the five stars already if you are doing DNSSEC validation, which is the main security feature of DNS, then 3% for each additional algorithm that the resolver supports, then 10% for trust anchors and 10% for not doing NXDOMAIN rewriting. There are similar schemes for privacy, performance and compliance. We have a proof of concept on this GitHub repository. We think it might be interesting if you could make your own rating scores for this; perhaps you could have a different category, like environment or friendliness, and present everything in some sort of portal that would present this for your resolver, based on the resolver which is detected. And this is my last slide. I would like to have your feedback, but in the hallway. Thank you.
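The security rating described in the talk can be written down as a short calculation. This is a sketch under stated assumptions, not the proof of concept from the repository: the function and parameter names are hypothetical, capping the total at 100% is an assumption, and so is reading the last 10% as being for not rewriting NXDOMAIN answers.

```python
def security_percentage(validates_dnssec, extra_algorithms,
                        trust_anchors_ok, no_nxdomain_rewriting):
    """Return a security rating in percent, using the weights from the talk."""
    percentage = 0
    if validates_dnssec:
        percentage += 60               # DNSSEC validation: 60%
    percentage += 3 * extra_algorithms  # 3% per additional algorithm
    if trust_anchors_ok:
        percentage += 10               # trust anchors: 10%
    if no_nxdomain_rewriting:
        percentage += 10               # no NXDOMAIN rewriting: 10% (assumed meaning)
    return min(percentage, 100)        # cap at 100% (assumption)


def stars(percentage, max_stars=5):
    """Convert a percentage into a number of stars out of max_stars."""
    return max_stars * percentage / 100


if __name__ == "__main__":
    pct = security_percentage(True, 2, True, False)
    print(pct, stars(pct))
```

Keeping the weights in one function is what would make it easy to swap in your own rating scheme or add another category.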
PETER HESSLER: Thank you very much. Next up in our lightning talks is Vasileios Kotronis, who is going to talk about ARTEMIS.
VASILEIOS KOTRONIS: This is about ARTEMIS which is an open source tool funded by projects, and I would like to give a short update on what it does and what it is.
So ARTEMIS is able to detect BGP hijacks in real time, with monitoring, detection and mitigation. We receive two main sources of information. One is a set of streaming sources like RIS Live, which are publicly available, or local sources like interfaces to your local route collectors. The other source of information is the ground truth from the operator. So the job of the monitoring module is to record incoming updates. The detection compares updates with the ground truth from the operator, and the mitigation can eventually react to an ongoing hijack.
So, let's give an example for this. The operator states that he owns a /22, that he announces it from two sources and that he has a single upstream in v6. Now, let's assume that AS 4 makes an announcement for a /33; that is not valid, so it is a wrong prefix. The monitoring system will probably catch it if there is enough visibility for that, and usually there is. The detection will conclude that this is a sub‑prefix hijack perpetrated by AS 4 and can proceed to further steps.
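The comparison between an incoming update and the operator's ground truth can be sketched with Python's standard `ipaddress` module. This is not the ARTEMIS detection code; the function name, the labels it returns and the example prefixes and AS numbers (documentation ranges) are assumptions, and the sketch only looks at the prefix and the origin, not at upstreams or the rest of the path.

```python
import ipaddress


def check_announcement(prefix, origin_asn, owned_prefix, valid_origins):
    """Compare one announced prefix/origin pair with the configured ground truth."""
    announced = ipaddress.ip_network(prefix)
    owned = ipaddress.ip_network(owned_prefix)

    # Different address family, or not inside the owned prefix: not our concern.
    if announced.version != owned.version or not announced.subnet_of(owned):
        return "unrelated"

    if announced == owned:
        # Exact prefix: only the origin decides.
        return "valid" if origin_asn in valid_origins else "exact-prefix hijack"

    # A more specific prefix that the operator never configured.
    return "sub-prefix hijack"


if __name__ == "__main__":
    print(check_announcement("2001:db8::/33", 64504,
                             "2001:db8::/32", {64501, 64502}))
```

Here any more specific announcement is flagged because it is not in the configuration, which is an assumption of this sketch; a real configuration could list several valid prefixes.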
So, focusing a bit on the ARTEMIS tool: besides these three modules I mentioned before, we have persistent storage to keep state regarding both BGP updates and hijack alerts, and an interface with which the operator can interact with the system: configure it, see the updates and the hijacks. And everything is routed over a scalable message bus.
So, regarding the features of the tool, it's available on GitHub if you would like to download it. Because of time, I will state only the primary features here. It can function in different modes, for example as a passive detector or as an active, user‑triggered mitigator of such events. It supports local installations and clusters, and both IPv4 and v6, among others.
So, to be able to reason about hijacks and about what is going on in an ongoing attack, we classify them in four distinct dimensions. These relate to what the hijacker can do: for example, what he does to the prefix, to the path itself, to the packets of the traffic that he can receive and, finally, whether he violates some kind of policy, as in a route leak event.
I have a demo that will happen on Wednesday and Thursday, and I can show some more examples there.
Also, a hijack can pass through several states during its lifetime. When it begins, it's an ongoing event. If, let's say, we see no update for X hours, as defined by the user, we consider it dormant but still somewhat active. And then the user has the option to essentially press some buttons and either trigger a mitigation procedure, ignore it or resolve it manually. There are automation tags, like the withdrawn and outdated ones, that examine whether all withdrawals have been received for this hijack, and the tool can automatically tag the alert accordingly.
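The ongoing versus dormant distinction can be expressed as a simple time comparison. This is a minimal sketch, assuming the state is derived only from the time of the last update seen for the hijack; the function name and the threshold parameter are hypothetical and do not come from the tool.

```python
from datetime import datetime, timedelta


def hijack_state(last_update, now, dormant_after_hours):
    """Return "dormant" if no update was seen for the user-defined number of hours."""
    if now - last_update >= timedelta(hours=dormant_after_hours):
        return "dormant"
    return "ongoing"


if __name__ == "__main__":
    now = datetime(2019, 10, 14, 16, 0)
    print(hijack_state(datetime(2019, 10, 14, 15, 30), now, dormant_after_hours=2))
    print(hijack_state(datetime(2019, 10, 14, 9, 0), now, dormant_after_hours=2))
```

The user actions mentioned in the talk (mitigate, ignore, resolve) would sit on top of this as explicit transitions and are left out here.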
At the heart of the tool you have the configuration file, whose generation can be automated. Essentially, here the user defines the prefixes, ASNs and monitors to be used and, most importantly, he defines the ARTEMIS rules, which can include routing policy. For example: these origin ASes can originate this prefix to these upstreams, and if something happens, please mitigate manually. In the demo I will show this for different kinds of events.
So, here I have some slides where I have used the PEERING testbed. PEERING has about 8 to 10 locations around the world; I am using two of them, one in California and one in Greece. I will advertise something from California with a valid origin and a valid neighbour, and then I will announce something from Greece with a valid origin but a fake neighbour, something not in the configuration. So, this will not be caught by RPKI, because it is the neighbour that is fake and not the origin. The prefix is also fine.
So, first of all we start the configured tool; there are no hijacks for the time being, and on the right side you see some information about the system itself and some statistics.
Here are the configuration file and the module controller. I am using the tool right now as a passive detector, I have not enabled mitigation, and I have stated that everything that is originated by this PEERING ASN, so 47065, with this upstream, is valid for this prefix. Now, if I make a normal announcement with this upstream, ARTEMIS will see the updates. Here the source is RIPE RIS Live, which is what I am using. Nothing happens, so it's all legal and valid. Now I am doing an announcement from the other side of the world, using a fake neighbour, so here it will see updates that have initiated a hijack alert. If you go into the alert, you can have more information, like for example who is, let's say, the violating AS, the type of the hijack, when it started, when it was detected and so on. For example, in this particular instance, we see that the time between origination and detection is about six seconds, so we can have almost real‑time alerting with this system.
Let's say we communicate with the perpetrator and he starts withdrawing the prefix. When every single monitor that has seen the hijack updates has also seen a withdrawal, we consider the hijack as ended and automatically tag it accordingly.
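The rule for considering a hijack ended is a subset check: every monitor that saw the hijack must also have seen a withdrawal. A minimal sketch, with hypothetical monitor names and a hypothetical function name:

```python
def hijack_withdrawn(monitors_seen_hijack, monitors_seen_withdrawal):
    """True when every monitor that saw the hijack also saw a withdrawal."""
    if not monitors_seen_hijack:
        # Nothing was observed, so there is nothing to mark as withdrawn.
        return False
    return monitors_seen_hijack <= monitors_seen_withdrawal


if __name__ == "__main__":
    seen = {"monitor-a", "monitor-b", "monitor-c"}
    print(hijack_withdrawn(seen, {"monitor-a", "monitor-b"}))
    print(hijack_withdrawn(seen, {"monitor-a", "monitor-b", "monitor-c"}))
```

Using sets means the order in which withdrawals arrive does not matter, only whether the last outstanding monitor has reported one.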
So, regarding next steps for the tool, we are looking into auto configuration, so generating this file automatically using either Ansible and Python or RPKI ROAs.
The next is auto mitigation. There are multiple ways to do that, and this is where we would like feedback from operators on what is the most friendly way to do that in the network. We are also looking at data plane extensions, using RIPE Atlas traceroutes, and we keep maintaining and testing it; we have already deployed it at AMS‑IX and in the US.
We would like to demo that on Wednesday and Thursday: different hijack types, the different actions you can take on the hijacks and some automation features that the tool offers. And with this, I would like to conclude this fast talk. Thank you very much.
MARIA ISABEL GRANDIA: Thank you, we have one minute left if you want to have one question? No. Thank you.
And our last speaker today is Wouter van Diepen, who is going to talk about 100 gig multiplexing.
WOUTER VAN DIEPEN: Thanks for pronouncing my name correctly. The talk is about the future of passive multiplexing, and multiplexing beyond 100G.
You are all network engineers and probably your boss is saying you should do a bit of DWDM and CWDM on the side as well. Going from 1G to 10G was easy: swapping the optics and everything was fine. But what happens after 10G? There is no DWDM 100G or 25G. Why is that, and how does that work?
Well, for multiplexing you have got three ingredients (because it's a lightning talk I will skip some): the dark fibre, the multiplexer, and the light and transceiver. Look at the dark fibre: there are two things that can restrict light over dark fibre. The first is attenuation, the weakening of light over the fibre due to the distance. The second one is dispersion, when two waves are overlapping each other, and that's because of the higher frequency. If you are at 1 gig, dispersion is no problem up to 20 kilometres or 200 kilometres. At 10G, 80 kilometres is more or less the end, and if you go to 25G or 100G it's even worse.
So, taking these two into account, you also have to think of the wavelength. The attenuation at 1550 is the most favourable, so you will have a loss of 0.25 dB per kilometre. If you go into the 1310 range you are having 0.35 dB per kilometre. With dispersion it's the other way around: you hardly have any dispersion at 1310 but you do have a high dispersion at 1550. So, there is no perfect colour or wavelength you can transmit in over regular fibre.
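As a worked example of the attenuation side of that trade‑off, the loss over a span is simply the per‑kilometre loss times the distance. The sketch below assumes roughly 0.25 dB/km around 1550 nm and 0.35 dB/km around 1310 nm; the 40 km span is an arbitrary example, and connector, splice and mux losses are ignored.

```python
# Approximate fibre loss in dB per kilometre, keyed by wavelength in nm.
# Values assumed from the figures quoted in the talk.
ATTENUATION_DB_PER_KM = {
    1550: 0.25,
    1310: 0.35,
}


def fibre_loss_db(wavelength_nm, distance_km):
    """Return the attenuation of the fibre alone over the given distance."""
    return ATTENUATION_DB_PER_KM[wavelength_nm] * distance_km


if __name__ == "__main__":
    for wavelength in (1550, 1310):
        loss = fibre_loss_db(wavelength, 40)
        print(f"{wavelength} nm over 40 km: about {loss:.1f} dB")
```

Over 40 km the difference is about 4 dB in favour of 1550 nm, which is the price paid in the 1310 nm range for having hardly any dispersion.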
Taking this into account, we go to the third ingredient. I will skip the multiplexer; that is for a longer presentation. If we go to light and transceivers, you all know CWDM and DWDM. There is a third one getting more popular right now and that is LWDM, around 1310, and there is no dispersion there, so the higher frequencies, 100G, 25, 400, will all work in this frequency band. You don't have the dispersion, but you have the higher attenuation.
So, there will be just regular passive muxes and regular optics operating in this LWDM band. You can go up to 8 times 25G, up to 40 kilometres, with regular optics.
In the future, you will also have 100G optics, going up to around 15 kilometres, I think up to 40. This is used a lot in China and Korea; it's the cheapest way to multiplex more than 10G over your fibre. Isn't there anything else? Well, if you want to go to higher frequencies in the DWDM band, you need modulation, as in coherent 100G. It's getting more information per pulse of light. You can see that clearly here, where in the top one you have the states of zero and one, and if you put that through a DSP chip you can have multiple states; that is modulation. The disadvantage of this is that you need a lot of power to do that modulation. For example, a CFP2‑DCO needs 20 watts, and everybody is working with QSFP28, where the max is 4.5 watts, so that cannot be incorporated inside of a QSFP28.
So for this solution you need an active box to go from a QSFP to a CFP2‑DCO. There is another solution, which is working with PAM4 modulation, and that does fit in a QSFP28, but it needs a dispersion compensator and an EDFA to amplify, because there is simply not enough power besides the modulation inside of the QSFP28. We made a box for this, which I will also show you in the coffee break. We have it all incorporated, dispersion compensator and all, in one.
So, here we are looking at the three solutions there are if you want to do more than 10G, 25G, or multiplex 100G. The first, LWDM, is the cheapest one: you can grab a passive mux for €600 and the optics are €400, so you can easily multiplex multiple 25G and, in the future, 100G optics.
The second solution is the DWDM QSFP28. You need a special box at around 6K, the optics are around 3K, and then you can multiplex multiple 100 gigs on your fibre.
The third solution is the CFP2‑DCO solution. There is a box in the middle; you need something going from that to bring it to your multiplexer. The solution will start around 17K, 20K.
These are the solutions if you want to do something different, or if you want to multiplex 100G.
We will do a workshop this coming Thursday, so you are all invited, Thursday at 6:00. We will build a little setup with eight times 100G, so there is enough to fail as well, so come and join and see how we fix all that. We will get some Cisco, some Dell and some other brands, put all the optics in and real fibre in place and configure all the boxes, so just come on Thursday.
That was it.
PETER HESSLER: Okay, thank you very much. Do you have any questions? We still have a few minutes.
MARIA ISABEL GRANDIA: Hi, Wouter. How does the cold, the temperature, affect it?
WOUTER VAN DIEPEN: The temperature does affect the fibre; you can see that when fibre freezes you will have more attenuation, but for the rest, there is no big difference. What you see in the optics in particular is that, if they heat up, there is always a temperature fluctuation, but the muxes and the devices all account for that. There is a temperature‑controlling unit in the DWDM optics to counter that shift.
MARIA ISABEL GRANDIA: Thank you.
PETER HESSLER: All right. Thank you very much.
And so, just before we end the plenary session for today, a few quick little reminders. Starting at 4 o'clock, there is the BCOP task force here in this room. Also, in the side room, which you can get to by going around to the other side of this wall, is the BoF for NOGs, for interested parties. We have open slots available for lightning talks in the Friday morning session, so if you would like to submit, send in your lightning talk suggestions. Other than that, we look forward to seeing you in the BCOP or the BoF, and then look forward to seeing you at the social tonight.
LIVE CAPTIONING BY AOIFE DOWNES, RPR